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Multitarget  Moments  and  Their 
Application  to  Multitarget  Tracking 

Ronald  Mahler 

Lockheed  Martin  Tactical  Systems,  MS  U2S25,  3333  Pilot  Knob  Rd,  Eagan  MN  55121,  USA 


ABSTRACT 

The  concept  of  a  ’’statistical  moment”  has  played  a  fundamental  role  in  practical  single-target  tracking.  The  optimal 
tracking  approach,  the  recursive  Bayes  filter,  propagates  the  entire  posterior  density  through  time.  Because  this  filter 
is  computationally  daunting,  most  practical  single-target  tracking  approaches  assume  that  signal-to-noise  ratio  is 
large  enough  that  the  posterior  is  approximately  characterized  by  its  low-order  moments.  For  example,  the  alpha- 
beta-gamma  filter  propagates  the  first-order  moment  (the  posterior  expectation)  whereas  the  extended  Kalman  filter 
(EKF)  additionally  propagates  a  second-order  moment  (the  posterior  covariance).  Until  recently,  the  possibility 
of  an  analogous  multitarget  approach  seems  to  have  been  ignored— apparently  for  lack  of  a  systematic  statistical 
foundation  for  multitarget  problems.  In  two  recent  papers,  I  introduced  multitarget  moment  statistics  of  arbitrary 
order  and  developed  a  Bayes  filtering  theory  for  the  first-order  multitarget  moment,  the  ’’probability  hypothesis 
density  (PHD).”  In  this  paper  I  continue  this  line  of  investigation.  I  will  describe  a  preliminary  implementation  of 
the  first-order  filter,  I  will  introduce  the  concept  of  multitarget  posterior  covariances  of  arbitrary  order.  Using  them 
I  will  show  how  a  suitable  extension  of  the  finite-set  statistics  (HSST)  multisensor-multitarget  differential  calculus 
can  be  used  to  construct  multitarget  statistical  analogs  of  the  EKF. 

1.  INTRODUCTION 

Progress  in  single-target  tracking  has  relied  upon  the  fundamental  concept  of  the  moment  statistics  of  a  random 
track-vector.  The  optimal  approach  to  single-target  tracking,  the  recursive  Bayes  filter,  propagates  the  entire  pos¬ 
terior  density  through  time.  Because  real-time  implementation  of  the  Bayes  filter  is  extremely  challenging,  most 
practical  single-target  tracking  approaches  assume  that  SNR  is  large  enough  that  the  posterior  is  approximately 
characterized  by  its  low-order  moments.  For  example,  the  alpha-beta-gamma  filter  propagates  the  first-order  mo¬ 
ment  (the  posterior  expectation)  whereas  the  extended  Kalman  filter  (EKF)  additionally  propagates  a  second-order 
moment  (the  posterior  covariance). 

Until  recently,  the  possibility  of  an  analogous  approach  has  apparently  been  overlooked  as  a  multitarget  tracking 
strategy.  The  reason  for  this  has  been,  apparently,  the  lack  of  a  systematic  engineering  statistics  for  multitarget  prob¬ 
lems  Even  more  surprisingly,  this  has  been  the  case  even  though  a  rigorous  statistical  foundation  for  multi-object 
problems,  called  point  process  theory,  has  been  in  existence  for  decades.  Specifically:  How  would  one  even  go 
about  defining  first-  and  second-order  multitarget  moment  statistics,  let  alone  developing  a  rigorous  Bayes  filtering 
theoiy  for  them?  As  we  shall  see  in  a  moment,  naive  approaches  to  defining  such  statistics  fail.  Consequently,  one 
must  turn  to  more  systematic,  theoretically-grounded  approaches. 

In  two  recent  papers  [27],  [17],  I  resumed  a  line  of  investigation  initiated  in  the  book  Mathematics  of  Data 
Fusion  [9,  pp.  168-170].  I  introduced  multitarget  moment  statistics  of  arbitrary  order  called  ’’factorial-moment 
densities”  and,  m  particular,  the  first-order  moment  also  known  as  the  ’’Probability  Hypothesis  Density”  or  ’’PHD”. 

I  developed  a  Bayes  filter  for  the  PHD,  using  a  random-set  formulation  of  point  process  theory  called  ’’finite-set 
statistics  (FISST).”  I  showed  that  this  first-moment  filter  is  a  multitarget  statistical  analog  of  the  alpha-beta-gamma 
filter.  That  is,  it  is  based  on  the  assumption  that  signal-to-noise  ratio  is  so  high  that  the  multitarget  posterior  density 
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is  approximately  characterized  by  its  PHD.  (Equivalently,  it  is  based  on  the  assumption  that  the  multitarget  track- 
process  is  approximately  Poisson.)  In  this  paper  I  continue  this  line  of  investigation.  I  will  describe  a  preliminary 
implementation  of  the  PHD  filter.  I  will  introduce  the  concept  of  multitarget  posterior  covariances  of  arbitrary  order, 
and  show  how  the  FISST  multi-target  differential  and  integral  calculus  can  be  used  to  construct  these  moments.  I 
will  also  show  how  a  suitable  extension  of  this  calculus  can  potentially  be  used  to  construct  multi-target  statistical 
analogs  of  the  Kalman  filter  (KF)  and  extended  Kalman  filter  (EKF). 


I 

life: 

H: 


All  of  this,  in  turn,  depends  on  a  firm  grounding  in  point  process  theory.  Progress  in  single-sensor,  single-object 
tracking  has  also  been  greatly  facilitated  by  the  existence  of  a  systematic,  rigorous,  and  yet  practical  engineering 
statistics  that  supports  the  development  of  new  concepts  in  this  field.  By  ’’engineering  statistics,”  I  mean  the  vast 
body  of  applied  mathematical  techniques  surrounding  the  following  ’’Statistics  101”  concepts  that  most  tracking 
engineers  learn  as  undergraduates:  random  vectors;  probability-mass  and  probability-density  functions;  differential 
and  integral  calculus;  statistical  moments  (expected  value,  covariance,  etc.);  optimal  state  estimators;  optimal  signal¬ 
processing  filters;  and  so  on.  Given  the  importance  of  such  concepts  in  the  single-sensor,  single-object  realm,  one 
would  expect  that  multisensor,  multi-object  tracking  would  already  rest  upon  a  similarly  systematic,  rigorous,  and 
yet  practical  engineering  statistics.  This  has  not  been  the  case  despite  the  existence  of  point  process  theory,  and  there 
appear  to  be  two  major  reasons  for  the  gap.  First,  theoretical  development  in  the  multitarget  tracking  community  has 
been  focused  primarily  on  immediate  engineering  applications  rather  than  on  systematic,  over-arching  foundations. 
Second,  neither  of  the  primary  mathematical  formulations  of  point  process  theory — random  measure  theory  [6], 
[15]  and  stochastic  geometry  [48],  [30] — have  been  well-suited  for  reduction  to  the  practical  ’’Statistics  101”  form 
favored  by  tracking  engineers.  This  is  partly  because  random  measure  theory  and  stochastic  geometry  both  tend  to 
be  somewhat  impenetrable  to  even  mathematically  sophisticated  engineers.*  It  is  also  partly  due  to  the  fact  that  point 
process  theory  has  been  directed  not  at  tracking  but,  rather,  towards  those  applications  which  gave  birth  to  it:  cueing 
theory,  image  signal  processing,  statistical  theories  of  gases,  liquids,  and  particles,  etc.  What  has  been  missing  has 
been  an  ’’engineering  friendly”  formulation  of  point  process  theory:  ’’finite-set  statistics”(FISST),  the  multisensor- 
multitarget  engineering  statistics  that  I  introduced  in  1994  .  FISST  is  essentially  a  judicious,  engineering-oriented 
distillation  of  point  process  and  related  concepts  drawn  from  stochastic  geometry,  random  measure  theory,  modem 
statistical  physics,  and  expert-systems  theory  (see  section  2).  FISST  is  ’’engineering  friendly”  in  that  it  is  geometric 
(it  treats  multi-object  systems  as  visualizable  random  images )  and  in  that  it  preserves  and  extends  the  "Statistics 
:  101  "formalism  that  tracking  engineers  already  understand. 

J|  1.1.  Statistical  Moments  in  Single-Target  Tracking 

??;  With  the  clarity  of  hindsight,  one  can  recognize  that  single-target  tracking  has  taken  the  following  conceptual  (if  not 
^-actual  historical)  trajectory.  If  real-time  computational  tractability  were  not  an  issue,  optimal  single-target  trackers 
be  implementations  of  the  following  Bayesian  discrete-time  recursive  nonlinear  filtering  equations  [13],  [42], 
If  [2,  pp.  373-377],  [14,  p.  174]: 


iSp 

vhere 


/fc-t-i|fc(xfc+i|Z*) 

/fc+i|fc+i(xfc+il^+1) 

xfc|fc 


J /fc+i|fc(xfc+ilxfc)  fk\k(*k\Zk)dxk 
/fc(Zfe+l lxfe+l)  fk+l\k(*-k+l\Zk) 

fk+l(zk+l\Zk) 

argsup  /fc|fc(x|Zfc) 


(1) 

(2) 

(3) 


IMS*  target  state  variable  at  time-step  k  and  zk  is  the  observed  measurement  at  time-step  k; 

at  the  1998  GTRI/ONR  Workshop  on  Tracking  and  Sensor  Fusion  [41],  A.  Skorokhod  gave  a  presentation  on 
■gjgure-based  multitarget  tracking  [38]  that  does  not  seem  to  have  been  well  understood. 
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(2)  fk\k^k\Zk)  is  the  Bayes  posterior  distribution  conditioned  on  the  data-stream  Zk  =  {zi,  ...,zk}; 

(3)  /*(z|x)  is  the  sensor  likelihood  function  at  time-step  k; 

(4)  /fc+i|fc(xfc+i|xfc)  is  the  target  Markov  transition  density  that  models  between-measurements  target  motion; 

(5)  /fc+i  |fc(xfc+i|Zfc)  is  the  time-prediction  of  the  posterior  fk\k(yik\Zk)  to  time-step  k  +  1; 

(6)  fk+1(zk+1\Zk)  =  f  fk(zk+1\yk+1)  fk-n\kyk+i\Zk)dyk+i  is  the  Bayes  normalization  constant;  and 

(7)  Xfc|*;  is  the  Bayes-optimal  maximum  a  posteriori  (MAP)  estimate  of  the  target  state  at  time-step  k. 

Real-time  computational  implementation  of  equations  1,  2,  and  3  is  extraordinarily  difficult  in  general.  So, 
instead,  a  historically  important  strategy  has  been  to  assume  that  signal-to-noise  ratio  is  high  enough  that  the  higher- 
order  moments  of  the  evolving  track- vector  ±k,k  can  be  neglected.  In  this  case,  the  first-  and  second-order  moment 
(i.e.,  the  posterior  expectation  vector  and  the  posterior  second-moment  matrix) 

=  J *  fk\k(x\Zk)dxi,  Mk\k  =  y  xxT  /fc|fc(x|Zfc)dx  (4) 

are  always  approximate  sufficient  statistics: 

fk\k(?*\Zk)  —  /fc|fc(x|xfc|fc,Mfc|fc)  =  Np^k{x-±k\k) 

where  ”T”  denotes  matrix  transpose  and  where  Npk^k (x  —  xfc|fc)  denotes  a  multi-variate  Gaussian  distribution  with 
covariance  matrix  Pk\k  =  Mk\k  —  x^x^.  We  can  then  propagate  xfc|fc  and  Mk\k  instead  of  the  full  distribution 
/fc|fc(x|Zfc)  using  an  extended  Kalman  filter.  If  SNR  is  so  high  that  the  second-order  moment  can  be  neglected  as 
well,  then 

fk\MZk)^fk\M*k\k)  (5) 

and  we  can  propagate  x/-|fc  alone  using  an  even  computationally  simpler  constant-gain  Kalman  filter — e.g.,  the 
a-/3- 7  filter. 


1.2.  The  Failure  of  Naive  Multitarget  Moments 

Given  such  hindsight,  someone  new  to  multitarget  tracking  might  surmise  that  the  field  has,  at  the  conceptual  if  not 
the  historical  level,  followed  a  similar  trajectory.  That  is,  he  or  she  might  presume  that  one  would  (1)  begin  with 
a  theoretically  systematic  framework,  namely  propagation  of  a  multitarget  posterior  density  fk\k{X\Z^)  using 
a  multitarget  analog  of  equations  1  and  2;  and  (2)  transcend  computational  difficulties  by  assuming  that  SNR  is 
high  enough  that  one  can  propagate  first-  and/or  second- order  moments  of  the  evolving  multitarget  track-set  Xk\k, 
instead  of  the  full  distribution  fk\k(X\ Z^).  In  reality,  multitarget  tracking  has  taken  a  quite  different  course.  The 
concept  that  multitarget  tracking  should  be  theoretically  grounded  on  a  multitarget  analog  of  equations  1  and  2  is 
apparently  itself  of  very  recent  vintage  (see  section  3.6).  The  idea  of  constructing  lower-order  statistical  moments  for 
a  multitarget  system,  and  using  them  as  the  basis  for  multi  target  tracking  algorithms,  seems  to  have  been  overlooked 
entirely.  This,  in  turn,  appears  to  be  attributable  to  the  lack  of  a  systematic  engineering  statistics  for  dealing  with 
multitarget  problems,  similar  to  the  statistics  that  has  long  been  available  for  single-target  problems. 

In  particular,  how  would  one  even  go  about  defining  first-  and  second-order  multitarget  moment  statistics?  It 
is  easy  to  see  that  a  naive  approach  fails  [22,  p.  41],  [24],  [25].  The  most  obvious  and  direct  way  of  defining  the 
first-order  moment  of  a  random  track-set  Fk\k  at  time-step  k  would  be  to  compute  its  expected  value  E[Tfc|fc]. 
Suppose  that  we  are  in  one  dimension  and  that  /*jfc(0)  is  the  probability  that  Fk\k  =  0  (no  targets),  fk\k(x)  is 
the  likelihood  of  the  event  Tfc|fc  =  {a:}  (single  track  with  state  x),  fk\k(xhx2)  is  the  likelihood  of  the  event 
Ffci*;  =  {xi,x2}  (two  targets  with  states  xi,  x2),  and  so  on.  If  the  expectation  exists,  it  must  have  the  form 


E[rfc|fc]  =  0  •  /fc|fc(0)  +  J M  -fk\k{x)dx  +  j {xux2}  ■  fk\k(xi,x2)dxidx2  +  ... 
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Such  an  integral  cannot  even  be  defined  unless,  at  minimum,  the  multitarget  state  space  is  a  vector  space  in  partic¬ 
ular,  unless  it  has  a  concept  of  addition/subtraction.  But  how  does  one  add  the  zero-target  state  0  to  a  single-target 
state  {x}l  Or  a  single-target  state  {2}  to  a  two-target  state  {21 ,  22}?  As  a  specific  example,  let: 

f  1/2  if  X  =  <2) 


fk\k(X)  =  {  \Na,(x-l)  if  *  =  {2} 

{  0  if  |X|>2 

where  ^2(2)  is  a  normal  distribution  with  a  variance  cr2  that  has  units  fcm2.  The  expectation  would  then  be 

E[r*,fc]  =  0  •  /(0)  +  J  x  f(x)dx  =  |(0  +  1  km) 

Notice  that,  at  minimum,  we  have  a  units-mismatch  problem:  we  are  asked  to  add  the  unitless  quantity  0  to  the 
quantity  1  km.  Even  if  we  assume  that  the  continuous  variable  2  is  discrete  so  that  this  problem  disappears,  we  still 
must  add  the  quantity  0  to  the  quantity  1.  If  0  +  1  =  0  then  1  =  0,  which  is  impossible.  If  0  +  1  =  1  then 
0  =  0,  so  the  same  mathematical  symbol  represents  two  different  states  (the  no-target  state  0  and  the  single-target 
state  2  =  0).  The  same  problem  occurs  if  we  define  0  -|-  a  =  ba  for  any  real  numbers  a,  ba  since  then  0  =  ba  —  a. 

These  difficulties  persist  even  if  we  assume  that  the  number  of  targets  is  known  a  priori.  For  example,  suppose 
that  we  have  a  two-target  system  with  two- target  distribution 

/fc|fc(2i,22)  =  ^iVCT2(2i  -21)^2(22-52)  +  2^2(22  ~x1)Nct2(x1  -22) 

where  the  targets  are  highly  separated:  cr2  is  small  and  21  is  much  larger  than  22.  Our  intuition  leads  us  to 
believe  that  the  multitarget  expectation  of  this  distribution  should  be  E|Tfc|fc]=  {21, 22}.  The  naive  expectation  of 
the  distribution  /fc|fc(2i,  22),  however,  yields  E[Tfc|fc]=  \  (21  +  52).  This  is  because  it  does  not  take  account  of 
the  fact  that  fk\k(x1,x2)  is  the  distribution  of  a  multitarget  system  and,  therefore,  that  its  symmetry  /fc|fc(21,22)  = 
fk\k (22  j  21 )  provides  no  information  about  this  system  other  than  the  fact  that  it  is  multitarget. 

1.3.  First-Order  Multitarget  Moments  and  Filtering 

By  a  track-valued  multitarget  expectation  I  mean  a  set  of  specific  tracks  of  the  form  E[F]=  {xfc|fc, ...,  xfc|fc}.  The 
proper  definition  of  the  track- valued  multitarget  expectation  E[Ffc|jt]  of  a  random  track-set  is  a  problem  that  currently 
resists  solution  (see  section  3.5).  As  a  result,  one  has  no  choice  but  to  construct  multitarget  moments  by  first 
I  specifying  some  function  <p  that  transforms  multitarget  state-sets  X  =  {xi,...,xn}  into  elements  <fi(X)  of 
some  suitably  well-behaved  vector  space.  The  transformation  <fi  should  be  one-to-one  and  it  should  transform 
:  set-theoretic  operations  into  corresponding  vector-algebra  operations — for  example,  h{X  U  Y)  =  h(X)  +  h(Y) 

j  whenever  X  n  Y  =  0.  In  this  case  we  can  compute  indirect  first-order  moments  of  the  form  E[h(rfc|j.)]  that  will 

jp  themselves  be  elements  of  this  vector  space.  Two  obvious  candidates  are  the  following  [9,  p.  179]: 

ip ,  '  _  h({xi,  ...,xn})  =  <5Xl  +  ...  +  SXn,  h({x i,...,xn})  —  AXl  +  ...  +  AXn 

g;. where  <5x(y)  is  the  Dirac  delta  density  concentrated  at  x  and  where  Ax  is  its  corresponding  probability-mass 

Kffion,  i.e.  the  Dirac  measure  AX(S)  =  fs  6x(y)dy  =1  (if  x  €  S)  and  =  0  (otherwise).  In  the  first 
'  E[fe(r*|fc)]  will  be  a  density  function  Dk |fc(y)  and  in  the  second  case  it  will  be  the  measure  Dk\k{S)  = 
®k\k(y)dy  corresponding  to  this  density. 

As,  it  turns  out,  Dk\k{y)  and  Dk\k(S)  have  long  been  recognized  as  useful  statistical  moments  in  point  process 
j|y;  Where  they  are  variously  known  as  the  expectation,  mean,  or first  factorial-moment  density  resp.  measure.  I 
^described  them  four  years  ago  in  the  book  Mathematics  of  Data  Fusion  [9,  p.  168-170].  There,  I  also  showed 


TJ5T 


that  Dk\k(y)  is  the  same  thing  as  the  probability  hypothesis  density  (PHD),  a  multitarget  tracking  and  evidence- 
accumulation  technique  proposed  in  1993  by  M.C.  Stein  and  C.L.  Winter  [46].  In  two  recent  papers  [27],  [17]  I 
derived  a  recursive  Bayes  filter  for  the  PHD.  This  filter  is  based  on  a  multitarget  analogy  with  equation  5.  That  is, 
signal-to-noise  ratio  is  assumed  to  be  so  high  that  the  PHD  is  an  approximate  sufficient  statistic, 

fkl„(X\Z^)  ^  hlt(X\Dm) 

and  consequently  that  we  can  propagate  Dk\k  rather  than  the  full  multitarget  posterior  distribution  fk\k(X\Z^). 
This  work  is  described  more  fully  in  section  3. 

Engineers  tend  to  react  with  puzzlement  to  the  idea  that  the  first  moment  of  a  random  multitarget  track-set  T^i k  is 

a  density  function,  namely  the  PHD  Dklk(x\Z^).  What  they  naturally  expect  to  see  is  a  track-valued  expectation. 

So,  what  exactly  is  the  PHD?  Intuitively  speaking,  just  as  the  value  of  the  probability  density  function  /x(x)  of  a 

continuous  random  vector  X  provides  a  means  of  describing  the  zero-probability  event  Pr(X  =  x),  so  the  PHD 

A;|fc(xZ(^)  provides  a  means  of  describing  the  zero-probability  event  Pr(x  €  rfc]fc).+  In  other  words,  it  is  the 

total  posterior  likelihood  that  the  multitarget  system  contains  a  target  that  has  state  x,  and  so  is  the  total  probability 

density  that  accrues  to  the  hypothesis  x  (and  hence  the  name  ’’probability  hypothesis  density”).  Consequently, 

Dk\k{AZ^)  will  tend  to  have  maxima  approximately  at  the  locations  of  the  targets.  Since  the  expected  number 

of  tracks  Nk\k  =  /Dfc|fc(x|Z(fc))dx  can  be  computed  directly  from  the  PHD,  multitarget  track-estimation  can  be 

accomplished  by  searching  for  the  [IV*]*]  highest  peaks  in  the  PHD,  where  [a;]  denotes  the  nearest-integer  function. 

♦ 

1.4.  Multitarget  Moments,  Covariances,  and  Filtering 

The  PHD  filter  summarized  in  section  1.3  makes  use  only  of  the  first-order  multitarget  moment  statistic.  It  is 
natural  to  ask  whether  second-order  statistics  can  be  applied  to  multitarget  systems.  Can  we  propagate  a  second- 
order  approximation  of  the  multitarget  posterior  density  (i.e.,  the  PHD  and  a  multitarget  covariance)  instead  of  the 
multitarget  posterior  itself?  Can  we  expand  the  multitarget  likelihood  function  and  multitarget  Markov  density  in 
some  kind  of  multitarget  Taylor’s  series”?  If  so,  we  could  potentially  produce  multitarget  analogs  of  the  Kalman 
filter  and  the  EKF. 

Another  purpose  of  this  paper  is  to  continue  the  line  of  investigation  begun  with  the  PHD  filter,  and  try  to 
determine  whether  or  not  it  is  possible  to  construct  multitarget  analogs  of  the  Kalman  filter  or  extended  Kalman 
filter  (EKF).  As  a  beginning,  in  1997  I  also  identified  multitarget  moment  statistics  of  higher  order  and  showed  that 
they  are  identical  to  well-known  point  process  moments  called  ’’factorial  moment  densities”  [9,  p.  168-170]. 

Results  in  this  direction,  summarized  in  section  4,  are  preliminary.  At  this  time,  a  computationally  tractable 
’’multitarget  Kalman  filter”  based  on  second-order  multitarget  moment  statistics  does  not  appear  to  be  imminent. 
However,  it  appears  that  it  might  be  possible  to  develop  a  multitarget  analog  of  the  EKF,  though  the  potential 
computational  practicality  of  such  a  ’’multitarget  EKF”  remains  to  be  seen.  This  work  is  reported  in  section  4. 

1.5.  Organization  of  the  Paper 

The  paper  begins,  in  section  2,  with  a  short  but  systematic  introduction  to  point  processes  (section  2.1),  finite-set 
statistics  (section  2.2),  and  the  relationship  between  the  two  (section  2.3).  Section  3  introduces  the  multitarget 
moment  density  function  (section  3.1),  the  Bayes  multitarget  first-moment  filter  (sections  3.2  and  3.1),  and  a  pre¬ 
liminary  implementation  of  this  ’’PHD  filter”  (section  3.4).  The  section  concludes  with  a  brief  discussion  of  an 
open  research  question  (how  to  define  track-valued  rather  than  function-valued  multitarget  first  moments,  section 
3.5),  and  a  summary  of  related  research  (section  3.6).  Section  4  explores  the  possibility  of  extending  this  line  of 
reasoning  to  second-order  multitarget  moments.  Possible  multitarget  analogs  of  the  Kalman  filter  and  the  EKF  are 

One  consequence  of  this  fact  is  that  the  PHD  is  the  same  thing  as  a  fuzzy  set  of  tracks  when  target  state  space  is  discrete  [9 
p.  169],  [27,  p.  108],  [17,  Section  2.5], 
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explored  in  sections  4.2  and  4.4,  respectively.  Mathematical  proofs  have  been  relegated  to  section  5;  conclusions 
may  be  found  in  section  6. 
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2.  MULTI-OBJECT  STATISTICS 

The  purpose  of  this  section  is  to  summarize  the  elements  of  point  process  theory  needed  for  this  paper.  I  describe 
point  process  theory  in  section  2.1.  In  sections  2.2  and  2.3 1  summarize  FISST  and  clarify  its  relationship  with  more 
traditional  formulations  of  point  process  theory. 

2.1.  Point  Processes  and  Functional  Calculus 

Randomly  varying,  multi-object  ensembles  are  a  basic  feature  of  multitarget  tracking.  The  ensemble  Z  of  ob¬ 
servations  collected  from  a  multitarget  system  is  a  set  of  randomly-varying  observation  vectors,  the  number  of 
which  is  itself  random.  If  we  take  a  Bayesian  viewpoint,  then  the  ensemble  X  of  target-tracks  also  is  a  set  of 
randomly-varying  objects  (state  vectors),  the  number  of  which  is  itself  random.  Point  processes  are  the  foundation 
for  representing  stochastic  systems  that,  like  these  examples,  consist  of  randomly  varying  objects  that  randomly  vary 
in  number.  The  purpose  of  this  section  is  to  provide  a  short  summary  of  this  theory.  This  summary  is  nonstandard, 
however,  in  that  it  borrows  from  and  integrates  the  practice  not  only  of  statisticians  and  stochastic  geometers,  but 
also  of  physicists  and  of  expert- systems  theorists  (most  notably,  I.R.  Goodman). 

In  this  section  I  define  point  processes  (section  2.1.1)  and  two  other  basic  concepts:  the  probability  generating 
functional  (section  2. 1 .2)  and  the  functional  derivative  (section  2.1.3).  Then  I  define  the  basic  statistical  descriptors 
of  point  processes:  Janossy  densities  (the  multi-object  analogs  of  probability  densities;  section  2. 1 .4),  factorial 
moment  densities  (the  multi-object  analogs  of  moments;  section  2.1.5),  and  factorial  cumulant  densities  (the  multi- 
object  analogs  of  covariances;  section  2.1.6). 

2.1.1.  Simple  point  processes  (finite  random  sets) 

There  are  many  ways  of  defining  point  processes.  Let  A y(5)  be  the  ’’Dirac  measure,”  defined  as  Ay  (S)  =  1  if 
y  €  5  and  Ay(S) = 0  otherwise,  where  S  is  some  space  O  of  objects  (e.g.,  states  or  measurements).  Let  <Sy(x) 
denote  the  density  function  of  Ay(S),  i.e.  the  Dirac  delta  function  concentrated  at  the  point  y.  Finally,  let  T  be  a 
randomly  varying  finite  set  of  objects  and  <5r(x)  =  £yer  Sy(x)  the  random  density  function  which  consists  of  the 
sum  of  the  Dirac  delta  functions  concentrated  on  these  objects.  Then 


Nr(S)  =  Y^ Ay(S)=  f  5r(x)dx  =  |rnSj 

yer 


(6) 


is  called  a  ’’random  counting  measure”  because  it  counts  the  randomly  varying  number  |r  n  S\  of  objects  in  T  that 
are  contained  in  the  region  SCO.  All  of  the  three  following  items  have  been  known  for  decades  [48,  p.  100-102], 
[1],  [39],  [36]  to  be  equivalent  and  interchangeable  definitions  of  a  (simple)  multi-dimensional  point  process  : 

1.  the  random  finite  subset  T;§ 

2.  the  random  measure  Nr{S)  or  its  corresponding  random  density  function  <5p(x). 


fe- .  1 2  Point  processes  can  also  be  defined  as  random  variables  on  unions  of  vector  spaces  [6,  p.  121].  Though  such  a  formulation 

jp .might  seem  intuitively  appealing,  it  is  so  mathematically  restrictive  that  it  is  rarely  used. 

l^%;  §More  general  approaches  allow  the  random  subset  T  to  be  countably  infinite  but  locally  finite,  i.e.  |T  n  I<\  <  oo  for  any 


y  bounded  subset  K.  We  will  not  follow  this  practice  here. 
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Not  all  random  multi-object  systems  of  interest  are  point  processes.  Let  Ay  e  R+  be  a  family  of  random 
positive  real  numbers  indexed  by  y  £  O.  Define  the  random  density  function 

$4,r(x)  =  ;4y  -Sy(x)  (7) 

yer 

and  its  associated  random  measure  Na,t{S)  =  53ygr  AyAy(S).  If  the  A(x)  were  random  positive  integers  then 
NA,r(S)  would  be  a  general  point  process;  otherwise,  it  is  called  a  compound  process.  Compound  processes  are 
nevertheless  mathematically  equivalent  to  simple  point  processes  (finite  random  sets)  on  the  product  space  M+  x  O 
[15,  pp.  5,  16,  22].  In  particular,  there  is  a  sequence  of  notational  correspondences 

MX1  +-..  +  a„<5Xn  <5(0l,Xl)  +  ...  +  fy^.xn)  <=*■  {(ai,xi),...,(ai,xi)}  (8) 

As  we  will  see  later  (section  4.4),  each  of  these  three  notations  can  be  interpreted  as  a  finite  collection  of  point 
object-clusters,  each  such  point  object-cluster  (a,  x)  consisting  of  an  average  number  a  of  objects  all  co-located  at 
x. 


2.1.2.  Probability  generating  functionals 


In  single  object  statistics,  the  statistical  behavior  of  a  random  number  Y  is  often  described  by  generating  functions 
such  as  the  characteristic  function  fy(y)  =E[eiyY],  the  moment-generating  function  My(y)  =E[eI'y>],  the 
factorial  moment-generating  function  Gy(y)  =E[yy>],  and  so  on  [4,  p.  83],  [6,  Chapter  1].  These  functions  are 
called  ’’generating  functions”  because  probability  functions  and  various  kinds  of  moments  can  be  generated  from 
their  iterated  denvatives,  e.g.  (dJMy/dy:,)( 0)  =E [Yj].  In  like  manner,  the  statistical  behavior  of  the  point  process 
T,  NV(S)  or  <5r(x)  is  described  by  various  kinds  of  generating  functionals.  My  emphasis  in  this  paper  will  be 
on  a  point-process  analog  of  the  factorial  moment-generating  function  called  the  probability  generating  functional. 
Given  a  function  h,  this  is  (if  it  exists)  the  expected  value  of  the  product  of  all  h(x)  taken  over  all  elements  x  of 


Gr[/i]  =  E 


.xer 


In  mathematical  point  process  theory  h  must  be  a  very  restricted  type  of  function,  to  ensure  that  the  expectation 
exists  for  very  general  point  processes  T  [6,  p.  141],  [48,  p.  116],  [36,  p.  13],  However,  I  will  follow  the 
practice  of  physicists  in  assuming  that  F  is  sufficiently  ’’nice”  that  Gp[/i]  can  be  defined  for  more  general  functions 
^  in  particular,  functions  that  incorporate  Dirac  delta  functions  <5y(w)  [40,  p.  190],  The  probability  generating 
functional  has  the  property  that  Gri,...,rn  [A]  =  GFl  [/*]  •  •  •  GVn  [fc]  if  r2 , ....  F„  are  statistically  independent. 


2.1.3.  Functional  derivatives 


For  any  functional  F[/i]  and  any  function  g  define 


[4]  =  Urn 

dg  J  e— *0  e 


dgi 


(10) 


if  the  limit  exists.  In  mathematics,  this  is  called  a  ’’Gateaux  derivative.”  Like  physicists,  I  assume  that  F  is 
sufficiently  nice  that  the  transformation  g  i— >  ff  [/i]  is  continuous  and  linear  for  any  fixed  h.  In  this  case 
it  is  also  known  as  a  ’’Frechet  derivative”  and  can  be  computed  using  the  usual  ”tum-the-crank”  rules  of  freshman 
calculus.  Provided  that  they  exist,  the  iterated  Frechet  derivatives  ( dnF/d9l  ■  •  •  dgn)  [A]  are  linear  in  each  argument 
9 1'  •••>  9n-  If  Frechet  derivatives  of  all  orders  exist,  then  F  can  be  expanded  in  a  Taylor’s  series  [40,  p.  190] 


w  =  Ej 


i=0 


i\  d(h  —  ho) 


where 


(FF 


dg 


&F 


dg---dg 


(ii) 


i  times 
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Because  of  continuity  and  linearity  and  since  g  =  f  g  ■  8ydy  (that  is,  g( w)  —  J  g(y)Sy(w)dy  for  all  w),  a 
physicist  would  write 

9F  C  dF 

-^[h]  =  J9{y)Wy[h]dy 

and  then  point  out  that  general  Frechet  derivatives  are  completely  specified  by  the  Frechet  derivatives 


physics 

1Ff~- 


mathematics 


Syi  •  •  •  <5yn 


8h(yi)  ■  ■  ■  Sh(yny 


These  are  known  in  the  physics  literature  as  functional  derivatives  [40,  pp.  173-174].  Equation  12  displays  three 
different  notations  for  these  derivatives:  the  ones  preferred  by  mathematicians  (rightmost)  and  physicists  (center), 
and  the  less  cumbersome  abbreviated  physics  notation  used  in  FISST  and  throughout  this  paper  (leftmost). 

2.1.4.  Janossy  densities 

Given  a  list  yi,...,yn  of  vectors,  the  iterated  functional  derivatives 

ir-(yi’-'y»)=^kI'!lL  03> 

are  the  Janossy  density  junctions  [6,  pp.  122-123]  of  F.  The  family  of  Janossy  densities  of  a  point  process  F  is 
the  multi-object  analog  of  the  probability  density  /y( y)  of  a  random  vector  Y.  That  is,  just  as  /y(y)  describes 
the  likelihood  of  occurrence  of  the  zero-probability  event  Y  =  y,  so  jr>(yi,  — ,  yn)  describes  the  likelihood  of 
occurrence  of  the  zero-probability  event  T  =  {yi,  ...,yn}.  The  Janossy  densities  of  simple  point  processes  are 
completely  symmetric  in  all  arguments,  vanish  whenever  y i  =  yj  for  some  1  <  i  ^  j  <  n  [6,  pp.  134,  Prop. 
5.4.IV],  and  are  jointly  normalized  in  the  sense  that  XXo  ^iJn(yi,  yn)  =  1-  The  relationship  between  the 
probability  generating  functional  and  the  Janossy  densities  can  be  found  by  expanding  Op  [h]  in  a  Taylor  s  series 
expansion  (equation  11)  around  h  =  0  [40,  p.  190],  [36,  p.  15],  [6,  pp.  142,  148]: 

00  j  r 

Gr[h]  =  h{yi)  ■  ■  ■  h(yn)  ir,n(yi, .... yn)dyi  ■•■dyn  (14) 

n=0  "  “ 

2.1.5.  Factorial  moment  densities 

In  ordinary  single-target  statistics,  moments  of  arbitrary  order  of  a  random  vector  Y  with  distribution  /Y(y)  are 
the  tensors  (multi-argument  linear  functions): 


mY)n(yi,-,yn)  =  J  (yi»y)---<y».y)  fv{y)dy 


■;  where  (yi,y2)  is  the  usual  scalar  (i.e,  dot)  product  of  vectors.  The  multitarget  moments  of  a  point  process  T  are 
g:  multi-argument  nonlinear  functions.  Given  a  list  yi,...,yn  of  vectors,  the  iterated  functional  derivatives 

I  <15) 

^factorial  moment  densities  of  the  point  process  T  [6,  pp.  130,  222],  [15,  p.  11],  [48,  pp.  111,116].  They 
iPm.also  be  shown  to  be  the  expected  value 


m.r,[n](yi,-,yn)  =E 


<5yi(wi)---<5y»(wn) 

wi^...^wner 
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where  the  summation  is  taken  over  all  n-tuples  (wi, wn)  of  distinct  objects  [48,  p.  38],  [15,  pp.  10-1 1]. 

Factorial  moment  measures  are  related  to  the  Janossy  densities  and  the  probability  generating  functional  by  the 
equations  [6,  pp.  133,  142, 149,  222],  [36,  pp.  11, 17]: 

00  i  r 

rar,[n](yi.-.y n)  =  Eft?/  7r,[n+fc](yi,...,yn,wi,...,wfc)dwi---dwfc  (16) 

00  (— f 

ir,w(yi»-»y«)  =  E^i  /  mr,[n+)fei(yi>->y«>wi>->w^)dwi'"dw*  (17) 

k= 0  '  J 

00  1  f 

Gr[i  +  fc]  =  E IT  /  Myi)  •  •  •  My*)  ™r,[fc,(yi) ....  yk)dyi  -  •  ^  08) 

A;=0  K‘  7 

If  n  =  0  then  mri[0]  =  1;  whereas  if  n  =  1  then  we  get  the  PHD  described  in  section  1.3:  mrj1](y)  =  Dp(y). 
It  is  clear  from  equation  16  that  mrj[nj(yi,  ...,yn)  =  0  if  yi  =  yj  for  some  i^j  (since  jr,[n](yi)  ...,y„)  has 
the  same  property). 

2.1.6.  Factorial  cumulant  densities 

In  ordinary  single-target  statistics,  the  covariance  moments  of  arbitrary  order  of  a  random  vector  Y  are  the  tensors: 

cy,n(yi, ....  yn)  =  J(  yi  -  y.  y)  •  •  •  (yn  -  y,  y)  fr(y)dy 

where  y  is  the  expected  value  of  Y.  The  multi-object  analog  of  these  quantities  are  the  iterated  functional 
derivatives 

c™(yi . (19) 

which  are  called  the  factorial  cumulant  densities  of  T  [6,  p.  146].  They  are  related  to  the  probability  generating 
functional  by  [6,  pp.  224] 

°o  1  , 

logC7r[l  +  h]  =  E 77  /  h(yi)  ■  ■  ■  h(yk)  cr>b-j(yi, .... yk)dyi  •  •  •  dyk  (20) 

k= o  K'  J 

If  n  =  0  then  crj0]  =  0;  and  if  n  =  1  then  cr)[1j(y)  =  wir,[ij(y)  =  Dr(y)-  If  n  =  2  then  we  get  the  factorial 
covariance  density  [6,  p.  146]: 


<5 2  log  Gr 


SGr^SGr^  ,  <52Gr 


crfy.y.)  = 

=  mr1[2](yi,y2)-£r(yi)A'(y2)  (21) 

Equation  21  can  be  generalized  to  statistics  of  all  orders,  since  the  factorial  cumulant  densities  can  be  written  as 
combinatorial  sums  of  products  of  the  factorial  moment  densities  [6,  p.  147], 


2.2.  Finite-Set  Statistics  (FISST) 

As  previously  noted  and  as  I  explain  more  fully  in  this  and  the  following  sections,  FISST  is  essentially  a  judicious, 
engineering-oriented  distillation  of  point  process  and  related  concepts  drawn  from  stochastic  geometry,  random 
measure  theory,  modem  statistical  physics,  and  expert-systems  theory.  The  theoretical  basis  of  FISST  has  been 
described  in  the  book  Mathematics  of  Data  Fusion  [9,  Chapters  2,  4-8].  The  engineering  motivations  underlying 
FISST  have  been  summarized  in  the  technical  monograph  An  Introduction  to  Multisource-Multitarget  Statistics  and 


Its  Applications  [22]  and  elsewhere  [23],  [21].  Rather  than  repeating  this  material  here,  I  direct  the  reader  to 
those  sources.  Instead,  in  this  section  I  summarize  the  basic  relationships  between  FIS  ST  and  the  (nonstan  ar  ) 
formulation  of  point  process  theory  described  in  the  previous  section. 

I  begin  with  basic  concepts:  the  belief-mass  function  (section  2.2.1),  set  derivative  (section  2.2.2),  multi-object 
density  function  (section  2.2.3),  and  set  integral  (section  2.2.4).  The  first  three  are  the  FISST  counteiparts  of  the 
probability  generating  functional,  functional  derivative,  and  Janossy  densities,  respectively.  I  then  turn  to  engmeer- 
•  ing  concepts:  the  multisensor-multitarget  likelihood  function  (section  2.2.5),  multitarget  Markov  transition  densities 
(section  2.2.6),  and  multitarget  posterior  densities  (section  2.2.7). 

2.2.1.  Belief-mass  function 

In  single-sensor,  single-target  problems,  tracking  engineers  typically  do  not  use  generating  functions  to  describe  a 
random  vector  Y,  but  rather  the  probability-mass  function  py{S)  =  Pr(Y  G  S)  or  its  density  function  My). 
In  like  manner,  FISST  describes  a  point  process  T  not  with  its  probability  generating  functional  (section  2.1.2)  but 
rather  its  ’’belief-mass  function”  /3r(S)  =  Pr(r  C  S)\  The  belief-mass  function  is  the  multi-object  analog  of  the 
probability-mass  function.  Its  value  fr(S)  is  the  total  probability  that  all  objects  in  P  will  be  contained  in  The 
belief-mass  function  and  probability  generating  functionals  are  closely  related.  For  any  closed  subset  5,  let  ls(x) 
be  the  function  defined  by  ls(z)  =  1  if  z  €  5  and  ls(z)  =  0  otherwise.  Then  the  belief-mass  function  is  the 
restriction  of  the  probability  generating  functional  of  F  to  these  charactenstic  functions  (see  section  5.  ). 

MS)  =  Gr[ls] 


2.2.2.  Set  derivative 

Let  Y  =  {yi, yn}-  Then  the  FISST  ’’set  derivative”  of  a  belief-mass  function  (3V  of  a  finite-random  set  (point 
process)  F  is,  if  it  exists, 
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where  Ey  is  a  small  region  that  converges  to  the  singleton  set  {y}  and  X(Ey)  is  its  hypervolume  (i.e.  Lebesgue 
measure).  The  set  derivative  can  be  thought  of  as  a  functional  derivative.  That  is,  if  both  the  respective  set  denva  ives 
and  functional  derivatives  exist,  then  they  are  related  by  (see  section  5.2). 
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2.2.3.  Multi-object  density  functions 

Let  T  be  a  simple  point  process  (i.e.,  random  finite  set).  We  construct  the  multi-object  density  function  fr(X)  of 
r  by  bundling  together  the  Janossy  densities  of  F  (section  2.1.4)  into  a  single  density  function  defined  on  the  space 
of  finite  random  sets: 

«r>  -*,<» . y.)  = 

Mn  stochastic  geometry  and  theoretical  statistics,  (3r (S)  is  known  as  a  specific  kind  of  ’’capacity  measure.”  If  1  is 
%  nonempty  (i.e.,  Pr(r  0)  =  1)  then  /3r(S)  is  a  ’’belief  function”  in  the  sense  of  the  Dempster-Shafer  theory.  Because  o 
Tf  this,  the  FISST  formulation  of  point  process  theory  encompasses  Dempster-Shafer,  fuzzy  logic,  and  rule-based  expert  systems 

approaches. 
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where  Y  =  {yi, yn}.  That  is,  the  multi-object  density  can  be  constructed  as  a  set  derivative.  Also,  Y  =  0 
indicates  that  no  object  is  present,  Y  =  {y}  indicates  that  one  object  with  state  y  is  present,  Y  =  {yi,y2}  indicates 
that  two  objects  with  states  yi  ^  y2  are  present,  and  so  op.  The  multi-object  density  fr(Y)  has  the  form 

/r(0)  =  likelihood  that  no  objects  are  present 
/r({y})  =  likelihood  of  one  object  x 


/r({yi i  •••)  yn})  =  likelihood  of  n  (distinct)  objects  xi , ...,  xn 


2.2.4.  Set  integrals 

Given  any  function  f(Y)  of  a  finite  set  variable  Y,  the  set  integral  is  defined  by 

f  f(X)6Y  =  L  f{{yi,-,yj})dyi---dyj 

Js  jZ qY-  JS  x  ...  x  S 

j  times 


(23) 


for  any  measurable  set  S,  where  by  convention  the  j  =  0  term  in  the  infinite  sum  is  /(0).  If  f  =  fv  for  some 
point  process  F,  then  for  each  j  the  jth  term  in  this  sum  is  the  probability  prj  —  Pr(|P|  =  j)  that  F  contains 
T  objects.  Given  this,  equation  14  can  be  rewritten  as 

Gr[h]  =  J  hY  ■  fr(Y)6Y 

where  hY  =  nyey^(y)-  Consequently,  if  fr(Y)  is  the  multi-object  density  function  of  T,  then  f  fr(Y)SY  = 
Gt[1]  =  1.  Substituting  h  =  Is  yields 

Pr{S)  =  Gr[ls]  =  J  l^fr{Y)6Y  =  J  fr(Y)SY 

The  set  integral  is  inverse  to  the  set  derivative,  in- the  sense  that 


«s)=/s|£(0w  tan- [w  Ssm)ew\s^ 

2.2.5.  Multitarget  likelihood  functions 

Let  E k  be  the  random  observation-set  generated  by  a  multitarget  system  with  multitarget  state  X  at  time-step  k 
and  (3k(S\X)  =  Pr(Efc  C  5|X)  its  belief-mass  function.  Let  Z  —  {  zi,...,zm}  be  a  particular  such  observation- 
set.  Then  the  multitarget  likelihood  function  fk(Z\X)  of  E k  is  the  multi-object  density  function  of  Efc.  That  is, 
it  is  constructed  by  bundling  the  family  of  Janossy  densities  of  E&  into  a  single  density  function  defined  on  finite 
sets  Z  of  measurements.  In  particular,  it  is  an  iterated  functional  derivative  of  the  probability  generating  functional 
of  Efc  [22,  p.  30]: 

f{Z\X)=jsh,n{*U~>*m)  =  f|(0lX) 


2.2.6.  Multitarget  Markov  transition  densities 

Likewise  let  Let  be  the  random  track-set  at  time-step  k  +  1  generated  by  a  multitarget  system  with 

multitarget  state  X  at  time-step  k.  Let  &+i,*(S|X)  =  Pr(rfc+1,fc  C  S\X)  be  its  belief-mass  function  and  let 
Y  -  {yi)  t.ti  yn}  be  a  particular  track-set  at  time-step  k  +  1.  Then  the  multitarget  Markov  density  fk+i\k(Y\x) 
for  rk+i\k  is  ihe  multi-object  density  function  of  Tk+y[k — i.e.,  the  family  of  Janossy  densities  of  the  point  process 
rfc+i|fc,  bundled  together  into  a  single  density  defined  on  finite  sets.  In  particular,  it  is  a  set  derivative  of  the 
belief-mass  function  / 3k+1\k  [22,  p.  30]: 

-.y«)  =  6-^rL9\x) 

2.2.7.  Multitarget  posterior  densities 

Let  Z (fc)  =  {Zi, ...,  Zk}  be  a  time-series  of  multisource-multitarget  observation-sets  Zj  collected  from  the 
randomly-varying  track-set  rk\k  of  a  multitarget  system  at  time-step  k.  The  Janossy  densities  of  this  point  process 
(random  finite  set)  are,  when  bundled  together  into  a  single  density  function  defined  on  all  finite  sets  of  target  states, 
the  ’’multitarget  posterior  density  function”: 

fk\k(X\Z{k))  =irfc|fcln(xi,...,xn)  =  ^(0|  Z^) 

where  /3uk(S\Z^)  =  Pr(rfc[fc  C  S)  is  the  belief-mass  function  of  rfc|fc.  In  other  words,  X  =  0  indicates  that  no 
target  is  present,  X  =  {x}  indicates  that  one  target  with  state  x  is  present,  X  =  {xi,x2}  indicates  that  two  targets 
with  states  xi  /  X2  are  present,  and  so  on.  So, 

A|fc(0| z^)  =  posterior  likelihood  that  no  targets  are  present 
fk\k{{x}\Z(-k))  =  posterior  likelihood  of  one  target  with  state  x 

/*|fc({xi,...,xn}|Z<fc))  =  posterior  likelihood  of  n  targets  with  states  Xi,  ...,Xn 

2.3.  FISST  vs.  Conventional  Point  Process  Theory 

Given  the  existence  of  these  relationships  between  FISST  and  the  more  usual  versions  of  point  process  theory,  what 
is  the  advantage  of  using  FISST?  When  I  published  Mathematics  of  Data  Fusion  in  1997, 1  argued  [9,  pp.  71-72, 
170]  that  FISST  offered  engineers  certain  advantages  over  a  multi-object  statistics  based  on  random  measures.  First, 
it  is  explicitly  geometric  in  that  the  random  variates  in  question  are  actual  sets  of  observations  (visualizable  random 
images)  rather  than  abstract  integer-valued  measures.  Second,  because  FISST  provides  a  systematic  foundation  for 
both  expert  systems  theory  (fuzzy  logic,  Dempster-Shafer  theory,  rule-based  inference)  and  multisensor,  multitarget 
estimation  and  filtering,  it  permits  a  systematic  and  mathematically  rigorous  integration  of  these  two  quite  different 
aspects  of  information  fusion.  Third,  systematic  adherence  to  a  random  set  perspective  results  in  a  formulation 
of  point  process  theory  that  is  nearly  identical  to  the  ’’Statistics  101”  formalism  with  which  tracking  engineers  are 
already  familiar.  In  subsequent  publications  [22],  [23],  [21]  I  spelled  out  what  this  means  in  greater  detail. 

■  •  just  as  single-sensor,  single-target  data  can  be  modeled  using  a  measurement  model  Zk  =  hk{pck,  W^),  so 

W  x .  multitarget  multisensor  data  can  be  modeled  using  a  multisensor-multitarget  measurement  model  a  randomly 
,  j.  varying  finite  set  Zk  =Tk(Xk)UCk(Xk)  where  Tk  and  Ck  indicate  target-generated  and  clutter-generated 
|H;:d  ’  observations,  respectively  [22,  pp.  17-20]; 
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•  just  as  the  single-sensor,  single-target  likelihood  function  fk( z|x*;)  can  be  derived  from  the  probability  mass 
function  pA(Sjxfc)  =  Pr(Z*  6  Sjxfc)  of  the  measurement  model  via  differentiation,  so  the  true  multitarget 
likelihood  function  fk(Z\Xk)  can  be  derived  from  the  belief-mass  function  (3k(S\Xk)  =  Pr(Efc  C  5|X^)  of 
the  multisensor-multitarget  measurement  model,  using  the  set  derivative; 

•  just  as  single-target  motion  can  be  modeled  using  a  motion  model  X*+i  =  ^(x*,  Vk),  the  motion  of 

|  multitarget  systems  can  be  modeled  using  a  multitarget  motion  model — a  randomly  varying  finite  set  Ffc+i  = 

1  §k{Xk,  Vfc)  that  takes  account  of  target  appearance  and  disappearance  [22,  pp.  21-23]; 

•  just  as  the  Markov  transition  density  /fc-fi|*;(xfc+i|xfc)  can  be  derived  from  the  probability  mass  function 

(  Pfc+i|fc('S'lxfc)  =  Pr(Xfc+i  €  £|xfc)  of  the  motion  model  via  differentiation,  so  the  true  multitarget  Markov 

j  transition  density  fk+i\k(^k+i\^k)  can  be  derived  from  /%+i|*(S'|Xfc)  =  Pr(rfc+1)fc  C  S\Xk),  by  applying 

the  set  derivative  to  the  belief-mass  function  of  the  multitarget  motion  model; 

f 

•  just  as  simple  ’’turn  the  crank”  rules  exist  for  freshman  differential  and  integral  calculus,  so  similar  turn  the 

i;  crank  rules  (sum,  product,  chain,  power)  exist  for  the  FISST  differential  and  integral  calculus  [22,  pp.  31-32]. 

I 

I 

|  Finally,  because  FISST  systematizes  these  ’’Statistics  101”  parallels  between  single-object  and  multi-objects 

statistics,  it  provides  a  general  engineering  methodology  for  attacking  multisource-multitarget  data  fusion  problems: 

J  •  Almost-Parallel  Worlds  Principle  (AP WOP):  Nearly  any  single-sensor,  single-target  concept  or  algorithm 

i  can,  in  principle,  be  directly  translated  into  a  corresponding  multisensor,  multitarget  concept  or  algorithm 

!  . 

j; 

i  Since  1 994,  a  long-standing  illustration  of  the  APWOP  has  been  the  concept  of  multitarget  information  measures 

of  effectiveness — for  example,  the  following  multitarget  generalization  of  the  Kullback-Leibler  cross-entropy  [9,  pp. 
205-209;  295-312],  [18],  [20],  [52],  [19]: 


single-sensor,  single-target 

=> 

multisensor-multitarget 

K(f,  g)  =  f  /(x)  log  )  dx 

a 

W,s)  =  ff(X)log(£§)6X 

Using  the  APWOP,  we  simply  replace  conventional  statistical  concepts  on  the  left  with  their  FISST  multisensor, 
multitarget  counterparts  on  the  right.  Specifically,  the  ordinary  densities  /,  g  on  the  left  are  replaced  by  the  multi¬ 
target  densities  /,  g  on  the  right;  and  the  ordinary  integral  f  -dx  on  the  left  is  replaced  by  a  multitarget  set  integral 
f  -SX  on  the  right. 

3.  MULTITARGET  FIRST-MOMENT  FILTERING 

The  purpose  of  this  section  is  to:  introduce  the  concept  of  multitarget  moments  of  all  orders  (section  3.1);  describe 
a  multitarget  filtering  approach  based  on  propagation  of  the  first-order  moment  (the  ’’probability  hypothesis  density” 
or  PHD,  sections  3.2  and  3.3  );  and  summarize  performance  results  for  a  preliminary  implementation  of  this  ’’PHD 
filter”  (section  3.4).  As  previously  noted,  the  PHD  filter  described  in  these  sections  is  a  statistical  multitarget 
analog  of  the  a.-(3- 7  filter.  It  sidesteps  the  computationally  intractable  problem  of  propagating  the  multitarget 
posterior  density  by,  instead,  propagating  only  the  first-order  moment  of  that  density.  This  is  possible  only  under 
the  assumption  that  signal-to-noise  ratio  is  relatively  high.  The  PHD  filter  must  be  implemented  as  a  computational 
nonlinear  filter,  using  analogs  of  equations  1  and  2.  It  tracks  targets  without  making  any  attempt  to  associate  reports 
with  tracks.  For  more  details,  the  reader  is  referred  to  recent  papers  [27],  [17],  [29],  [7].  I  conclude  the  section 
with  an  open  research  problem  (how  to  define  track-valued  rather  than  function-  or  measure-valued  expected  values, 
section  3.5) — and  a  short  history  of  Bayes  recursive  multitarget  nonlinear  filtering  (section  3.6). 
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3.1.  The  Multitarget  Moment  Density  Function 

Recall  that  in  section  2.1, 1  used  an  iterated  functional  derivative  of  the  probability  generating  functional  Grfcl  Jh] 
of  a  random  track-set  (point  process)  rk[k  to  define  the  Janossy  densities  jrt,fc,n(xi,  •••>*«)■  Then  in  section  2.2 
I  bundled  these  densities  together  into  a  single  multitarget  posterior  density  function  /fc|fc(X|Z(fc))  and  showed 
that  it  can  be  computed  as  an  iterated  set  derivative  of  the  belief-mass  function  (3k[k(S)  of  Tk]k.  In  section  2  1 
I  also  introduced  the  concept  of  the  factorial  moment  densities  of  a  point  process.  Applying  equation  15  to  the 
random  track-set  rfc|*,  we  likewise  bundle  the  factorial  moment  densities  mrfclfct[n]  (*i ,  *n)  mt0  a  sinSle  density 

function  Tnk^k(X\Z^)  defined  on  a  finite-set  variable  X  =  and  call  it  the  multitarget  moment  density 

function  of  the  track-set: 

mfc|fc(0|Z(fc))  =  1,  rnk{k(X\zM)  =  mrfc|fci[n](xi,  ...,x„) 


From  equation  15  it  follows  that 


1(X|ZW)=^(S|Z(‘>)|  =  ^»(S|Z«) 


where  S  is  the  entire  state  space.  Thus  the  multitarget  moment  density  can,  like  the  multitarget  posterior  density, 
be  computed  from  pk\k  using  iterated  set  derivatives.  Likewise,  equations  16, 17,  and  18  become 

mklk(X\zW)  =  f  fk\k(Y\zW)6Y  =  J  fk\k(X  UW\Z™)6W  (25) 

»/yDX 

/t|l(X|Z<‘>)  =  f  (— l)1”'1  mklk{Y\zM)6Y  =  /'(-l)1"'1  U  W)Z<»)6W  (26) 

G„»tl  +  *1  =  j  hx  .mt|l(X|Z(‘))6X  .  <27> 

where  hx  =  l\xeX  K*)  Sivcn  x  =  {xi,  •••>  xn}-  Equations  25  and  26  tell  us  that  the  multitarget  posterior 
and  multitarget  moment  densities  are  interchangeably  computable  from  each  other.  Equation  25  additionally  tells 
us  that  mk]k(X\Z^)  is  the  marginal-posterior  likelihood  that,  regardless  of  how  many  targets  there  may  be  in  the 
multitarget  system,  exactly  n  of  them  have  states  xi,...,xn. 

In  particular,  the  first  moment  density  (or  ’’probability  hypothesis  density,  PHD  [46],  [9,  pp.  168  169])  has  the 
form 

Dt|t(x|zW)  =  mi|t(x|Z<‘>)  =  JxJ#(.X\Z^)SX  =  j  A|*({x}U  Y|Z<‘>)«Y 

For  any  measurable  subset  S  C  S  of  state-vectors,  the  expected  number  of  targets  in  S  is  [27,  p.  106],  [17, 
Theorem  2],  [9,  p.  169] 

Nklk  =  E[  |rfc|k  OS\]  =  Js  Dfc|fc(x|Z(fc))dx  (28) 

This  property  characterizes  the  PHD  uniquely.  That  is,  if  £fc(x)  is  any  other  density  which  gives  the  expected 
p  number  of  targets  in  5  when  integrated  over  S,  then  it  is  (no  matter  how  imaginative  the  name  or  notation  one  might 
■  assign  to  it)  nothing  else  but  the  PHD.  For,  since  Jsgk(x)dx  =  fs  Dkjk(x\Z^)dx  for  all  measurable  S  then 
3k\k  —  Dk |fc  almost  everywhere.  A  typical  example  of  a  PHD  is  pictured  in  Figure  2. 
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3.2.  Basic  Idea  Behind  the  PHD  Filter 

The  theoretically  optimal  foundation  for  multitarget  detection,  tracking,  and  identification  is  the  following  general¬ 
ization  of  the  recursive  Bayes  filter  equations  1,  2,  and  3: 

fk+i\k(Xk+1\zW)  =  f  fk+llk(Xk+1\Xk)  fklk(Xk\zW)6Xk  (29) 

*  fk(Zk+i\Xk+i)  fk+1\k(Xk+1\Z^) 

A+.I*+1(X*+1|Z  )  -  A+1<zt+1|Z<‘>)  (30) 

r\x\ 

1  =  <31> 

where  [22,  pp.  47-49],  [9,  p.  237-243]: 

(1')  Xk  is  the  multitarget  state,  i.e.  the  set  of  unknown  target  states  (which  are  also  of  unknown  number)  and 
Zk  is  the  set  of  all  measurements  collected  from  all  targets  at  time-step  k; 

(2')  fk\k{Xk\Z^)  is  a  multitarget  posterior  density  at  time-set  k  conditioned  on  the  time-stream  Z^>  — 
{Zi, Zk }; 

(30  fk(Z\X)  is  the  multisensor,  multitarget  likelihood  function  that  describes  the  likelihood  of  observing  the 

observation-set  Z  given  that  the  multitarget  system  has  multitarget  state-set  X; 

# 

(40  fk+i\k(Xk+i\Xk)  is  the  multitarget  Markov  transition  density  that  describes  the  likelihood  that  the  targets 
will  have  state-set  Xk+\  at  time-step  k  +  1  given  that  they  had  state-set  at  time-step  k; 

(50  fk+i\k(Xk+i\Z^)  is  the  time-prediction  of  the  multitarget  posterior  fk\k{Xk\Z^)  to  time-step  fc  + 1;  and 

(60  fk+1(Zk+1\Z^)  =  f  fk(Zk+1\Y)  /fc-fi|fc(F|ZW)5y  is  the  Bayes  normalization  constant. 

(A  short  history  of  Bayes  multitarget  filtering  can  be  found  in  section  3.6.)  The  multitarget  filtering  equations  29, 
30  and  31  cannot  be  copied  from  the  single-target  filtering  equations  1,  2,  and  3  in  the  blind  fashion  just  suggested 
[22,  pp.  1-6,  91-93],  [21],  [25],  [24],  For  example, 

(70  the  direct  multitarget  generalization  of  the  MAP  estimator  is  not  defined  in  general  (and,  as  previously 
indicated,  the  multitarget  generalization  of  the  posterior  expectation  still  resists  definition);  Xk^k+1  is  a  specially- 
defined  multitarget  analog  of  the  MAP  estimator  [22,  40-44]; 

(80  the  integrals  J  ■ SXk  and  f  -8Y  are  set  integrals. 

Clearly,  if  the  single-target  Bayes  filter  equations  1,  2,  and  3  are  already  computationally  daunting,  then  the 
multitarget  filter  equations  29, 30,  and  3 1  will  be  computationally  intractable  in  most  circumstances.  So,  we  proceed 
by  analogy  from  the  single-target  case.  There,  we  assumed  that  SNR  is  so  high  that  the  posterior  is  approximately 
completely  characterized  by  its  first  moment  statistic  and  then  propagated  ±k\k  alone  using  a  constant-gain  Kalman 
filter.  The  idea  underlying  the  PHD  filter  is  to  extend  this  reasoning  to  the  multitarget  case.  We  assume  that  SNR 
is  so  high  that  the  first-order  moment  (the  PHD)  of  the  multitarget  system  is  an  approximate  sufficient  statistic: 
fk\k{^k\Z^k'))  =  fk\k{,Xk\Dk\k).  We  then  “fill  in  the  question  marks”  in  the  following  diagram, 


|  ^  ^ multitarget  prediction 

fk+\\k  _ ^multitarget  Bayes’  rule 

/fc+iifc+i 

> 

i 

_ ^moment  prediction?? 

f)  _ ^moment  data  update?? 

1  j/c 

Dk+l\k+l 

where  the  top  row  portrays  the  time-evolution  of  the  multitarget  Bayes  filtering  equations  29  and  30;  and  where  the 
downward-pointing  arrows  indicate  the  replacement  of  multitarget  posteriors  by  their  corresponding  PHD’s.  Ideally, 
this  diagram  would  be  Bayes-closed  in  the  sense  of  Kulhavy  [12]  and  litis  [11],  meaning  that  for  any  k: 
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1.  the  two  time-update  paths  fk\k  - 4  A+i|&  ~ 4  As+ijfc  and  fk\k  ~ 4  -D*|fc  — 4  Ac+i|fc  yield  the  same  -D/c+:i|fc> 

2.  the  two  data-update  paths  fk+i\k  — 4  fk+i\k+i  —4  Afc+i|fc+i  an<^  A+i|fc  4  Dk+i\k  ~ 4  ^fc+i|fc+l  y^d 
same  £)fc+1|fc+i;  and 

3.  JDfc+i|fc  is  a  “best-fit  approximation”  of  /fc+i|k,  and  Dfc+i|fc+i  is  a  “best-fit  approximation”  of  fk+i\k+b  'm 
some  explicitly  specified  sense. 

If  all  three  of  these  properties  were  satisfied,  the  filter  indicated  in  the  bottom  row  of  the  diagram  would  be 
theoretically  guaranteed  to  not  diverge.  That  is,  it  would  always  produce  exactly  the  same  best-fit  Dk\k  or 
Dk+i\k  that  one  would  get  if  one  could  implement  the  optimal  filter  in  the  top  row  and  then  compress  the  respective 
multitarget  posteriors  into  their  corresponding  first  moments.  In  the  approach  proposed  in  this  paper,  property  (2)  is 
only  approximately  satisfied.  The  PHD  filter  is  based  on  the  following  assumption:  the  random  multitarget  track-set 
Fk\k  is  approximately  equal  to  the  Poisson  process  ( Poisson  track- cluster)  f  fc|fc  that  most  closely  resembles  it.  By 
definition,  V  is  a  multidimensional  (spatial)  Poisson  process  with  intensity  function  D(x)  if  [48,  p.  33],  [43,  pp. 
86-87],  [15,  p.  6]: 


1  for  all  i  >  1,  r  n  Si, ...,  r  n  Si  are  statistically  independent  random  sets  whenever  Si,... ,5*  are  mutually 
disjoint  bounded  subsets;  and 

2.  the  number  of  elements  of  T  in  any  bounded  subset  S  is  Poisson  distributed:  Pr(|F  fl  S\  —  n)  = 
e-D(s^D(s)lt  where  D(S)  =  fsD(x)dx  is  called  the  intensity  measure. 

The  belief-mass  function  of  a  Poisson  process  has  the  form  f3(S)  -  exp  (D(S)  -  N )  where  N  —  J  D(x)dx, 
the  expected  number  of  objects,  is  given  by  equation  28.  The  multi-object  moment  density  function  and  multi-object 
posterior  of  a  Poisson  track-set  have  the  simple  forms  [6,  p.  226]: 

* 

m*|k(0)  =  1.  ™k\k(x)  =  II  ^32') 

xeX 

/*|fc(0)  =  e-"^,  A,fc(X)=e-^l*  II%(X)  (33) 


Equation  33  tells  us  that  the  joint  density  of  the  tracks  xi, ...,  x„,  given  that  there  are  n  tracks,  is 


fk\ki.X-li 


*. z<i))  -  h.  . *.}*  *»>  -  = iim*> 


where  /fe|fc(xi, ...,  xn| n,  Z(fc))  is  the  joint  posterior  distribution  of  the  vector  (xi, ...,  x„)  and  where  sk |*(x)  = 

.  £>*|fc(x) /Nk\k  is  the  spatial  probability  distribution  of  each  track  [43,  p.  90].  That  is,  given  the  number  of  tracks, 
their  locations  are  independent,  identically-distributed  random  vectors.^ 

We  now  choose  fk \k  so  that  it  is  the  multitarget  distribution  of  the  Poisson-distributed  track-cluster  Fk\k 
that  best  approximates  the  posterior  track-set  r^,  in  the  sense  that  it  minimizes  the  multitarget  Kullback-Leibler 
discrimination  [9,  pp.  205-215]: 

:‘:V  *(/.,»;  fro - /  Ait(^iz(t))i°g  (A/^(xp)  ^  (34) 

"This  fact  tells  us  how  to  construct  Poisson  track  clusters.  Let  X* ,  X2 , .. .  be  an  infinite  sequence  of  independent,  identically 
!*,  distributed  random  track- vectors  and  let  v  be  a  Poisson-distributed  random  integer:  Pv(v  =  n)  =  e~^^kN^k/n\.  Then 
fe  ►  T  =  (Xi,  ...,x,;}  is  a  Poisson  track  cluster,  where  it  is  assumed  that  T  =  0  whenever  i/  =  0. 
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Minimization  occurs  (see  section  5.3)  when  £>A|fc(x)  =  Dk\k{x\Z^>).  This  in  turn  implies  that  Nk\k  =  Nk 
where  Nk\k  =  J  Dk\k{x\Z^)dx.  is  the  expected  number  of  tracks  at  time-step  k.  Consequently:  when  we  say 
that  the  first-order  moment  Dk\k  can  be  propagated  in  place  of  the  multitarget  posterior  what  we  are  actually 
saying  is  that  we  can  propagate  fk\k  in  place  of  fk\k  because  it  is  a  reasonably  good  approximation  in  the  sense 
that  the  track-set  rfc|fc  strongly  resembles  the  Poisson  track-cluster  ffc|fc:  i.e.,  fk\k  =  fk\k. 

3.3.  The  Bayes  First-Moment  Filtering  Equations 

Given  these  preliminaries,  it  can  be  shown  [17,  Theorem  4],  [27]  that  between  measurement-collection  times,  the 
PHD  can  be  propagated  in  time,  without  approximation ,  using  the  exact  prediction  integral 

Dk+i\k(y\Z{-k))  =  J  (4+i|fc(x)/*+i|fc(y|x)  +  6fc+i|fc(y|x))  Dk\k{x\Z^)dx.  (35) 

where 

(1")  1  -  4+i|fc(x)  is  the  probability  that  a  target  with  state  x  at  time-step  k  will  disappear  from  the  scene  at 
time-step  k  + 1; 

(2'0  4+1|fc(y|x)  is  the  PHD  of  the  multitarget  density  4+i|fc(^W  that  describes  the  likelihood  that  a  target 
with  state  x  at  time-step  k  will  generate  a  set  X  of  new  targets  at  time-step  k  +  1; 

If  the  track-set  strongly  resembles  a  Poisson  track-cluster  then  it  can  be  shown  [17,  Theorem  5],  [27]  that  the 
following  approximate  equation  is  the  Bayes  update  of  the  PHD  using  a  new  scan  Zk+ 1  of  data: 


4+1|*+i(*|z<*«>)  a  £  - - nk+ WxK2W) 

zez^+i  /Wicfc+i(z)  +pDDk+i(z\ZW) 


(36) 


where: 

(3")  Nk+i\k  =  J  Dk+i\k(x\Z^)dx  is  the  predicted  expected  number  of  targets  at  time-step  k  +  1; 

(4")  pD  is  the  (state-independent)  probability  of  detection  of  the  sensor,  assumed  to  be  large  enough  that 
Pd>  1  -  N~'1[k  for  any  k; 

(5")  Afc+i  is  the  average  number  of  Poisson  false  alarms  per  data-scan,  and  cfc+1(z)  is  the  (state-independent) 
distribution  of  each  of  these  false  alarms; 

(6")  Dk+1(z\ZW)  =  //(z|x)  Dk+1\k(x\Z^)dx;  and 

(7")  Dk+1\k+1(-x\z,  Z^)  =  /(z|x)  Ac+i|fc(xl^^)  DA+1(z|Z(A))_1  is  a  Bayes’ rule-like  update  of  Dk+1\k(x\Z^) 
using  the  observation  z. 

It  can  also  be  shown  [17,  Theorem  6]  that  equation  36  is  easily  extended  to  deal  with  multiple  sensors,  assuming 
that  the  observation-sets  collected  by  these  sensors  are  conditionally  independent. 

In  other  words:  if  SNR  is  large  enough  then  multitarget  detection,  tracking,  and  identification  can  be  ac¬ 
complished  using  a  process  that  strongly  resembles  single-target  nonlinear  filtering— and  that  also  does  not  re¬ 
quire  report-to-track  association.  Rather,  data  association  is  essentially  replaced  by  multi-peak  extraction.  At 
each  stage,  the  PHD  filter  propagates  not  only  the  PHD  Dk^k(x\Z^)  but  also  the  expected  number  of  targets 
Nk\k  =  /  Dk\k{x\Z^)dx.  Consequently,  estimation  of  the  multitarget  state  is  accomplished  by  computing  the 
nearest  integer  [Nk{k]  in  Nk]k  and  then  searching  for  the  [A^J  largest  peaks  of  Dk[k(x\Z^). 
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Figure  1.  Simple  Test  Scenario.  Target  1  enters  at  upper  right  of  the  scene,  Target  2  enters  at  lower  left,  and  Target 
3  is  near  the  center  and  motionless.  Target  1  overruns  Target  3  at  the  time  of  38th  observation. 


3.4.  PHD  Filter:  Preliminary  Results 

>  The  purpose  of  this  section  is  to  describe  the  performance  of  a  preliminaiy  implementation  of  the  PHD  filter,  as 

applied  to  a  simple  model  problem.  This  work  is  described  more  fully  in  recent  conference  papers  [7],  [29]. 

,  The  PHD  filtering  equations  35  and  36  were  implemented  using  a  ’’spectral  compression”  computational  nonlinear 

I :  filtering  technique  developed  at  Lockheed  Martin  Tactical  Systems.  A  simple  multi-peak  extraction  algorithm  was 

ji  iiv  used  to  find  the  N  largest  peaks  of  the  PHD  at  each  time-step,  where  N  is  the  integer-valued  expected  number  of 
targets  supplied  by  the  PHD  filter.  This  peak  extractor  divides  the  scenario  into  a  quantized  grid  and  searches  for 
maximal  values.  Localization  accuracy  is  thus  limited  by  the  quantization  error. 

The  PHD  filter  was  applied  to  the  simple  three-target  model  scenario  pictured  in  Figure  1 .  Two  moving  targets 
enter  the  scene  from  upper  right  and  lower  left,  whereas  a  third  target  is  motionless  at  the  center.  One  target  overruns 
'  *  the  motionless  target  at  the  time  of  the  38^  observation,  generating  confusion  about  target  location.  The  scene  is 
observed  by  a  single  Gaussian  sensor.  In  keeping  with  the  assumptions  that  underlie  the  PHD  filter,  signal-to-noise 
■if.  ratio  is  assumed  to  be  high:  a  =  .01.  There  are  no  false  alarms,  and  target  number  is  assumed  to  be  unknown  but 

■  v.,:'  unchanging.  A  simple  instantaneous  straight-line  motion  model  is  used.  With  these  assumptions,  the  PHD  filtering 

I  If;  '  equations  35  and  36  reduce  to  the  simple  form: 
t  ■■.  '■ 

if-  4+1|„(y|z<*>)  =  //MiUyW 4+m+iWz(‘+1))  =  £  At+ii*+iM*> zw) 

f,  '•  .■ 

^rresu^ts  316  as  follows. 

I  1  ’ 1  Figure  2  shows  the  graph  of  the  PHD  at  the  time  of  the  35th  observation.  Despite  the  fact  that  some  confusion 
^get  1  and  Target  3  is  beginning  to  occur,  the  peaks  corresponding  to  the  three  targets  are  clearly  separated. 


target  2 
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Figure  2.  PHD  at  35th  Time-Step.  Since  Target  3  is  motionless,  the  peak  corresponding  to  its  location  is  large 
compared  to  the  peaks  for  the  other  two  targets. 


Because  Target  3  is  motionless  there  is  less  confusion  regarding  its  position,  and  so  its  peak  is  higher  than  the  peaks 
corresponding  to  the  other  two  targets. 

Figure  3  shows  the  tracking  performance  of  the  PHD  filter  throughout  the  scenario,  using  a  measure  of  perfor¬ 
mance  called  the  ’’Hausdorff  multitarget  miss  distance”  [52],  [22,  p.  46],  [9,  pp.  137-138],  The  Hausdorff  distance 
dHaus{G,  X )  is  the  multitarget  analog  of  the  single-target  concept  of  Euclidean  miss  distance  ||g  -  x|  |  between  a 
ground-truth  target  g  and  a  track  estimate  x.  It  is  defined  as: 

dHaus(G,X)  =  max{d{)(G,X),  do(X,G)},  do(G,  X)  =  maxmin  ||g  —  x|| 

QE.G  x.EX 

where  G  is  the  set  of  ground  truth  targets  and  X  a  the  set  of  track  estimates.  In  our  case  G  =  {gi,g2,g3}  is  the 
indicated  ground  truth  at  a  particular  observation-time,  and  X  =  (xi ,  x2,  x3}  are  the  track  locations  generated  by 
the  PHD  via  peak  extraction,  at  the  same  observation-time. 

Figure  3  shows  that,  over-all,  the  PHD  filter  successfully  tracks  the  targets  even  though  it  does  not  try  to  associate 
reports  with  tracks.  Tracking  accuracy  is  limited  by  the  quantization  error  of  the  peak-extraction  algorithm  rather 
than  by  the  PHD,  and  is  approximately  a  =  0.005.  It  is  good  even  before  and  after  the  time  of  the  38t/l  time-step, 
the  time  of  maximal  target-to-target  confusion.  The  large  error  spikes  at  the  35tft  and  40th  observations  are  caused 
by  failures  in  the  peak-extraction  algorithm  and  not  the  PHD  filter  itself.  (Figure  2  shows,  for  example,  that  the 
peak  extractor  failed  at  the  35tft  observation  even  though  the  three  peaks  are  clearly  separated.)  In  these  cases,  the 
peak  extractor  failed  to  find  one  of  the  peaks  and  assigned  two  target  locations  to  another  peak. 

This  experiment  demonstrates  that  the  development  of  good  multi-peak  extraction  algorithms  is  critical  to  the 
successful  implementation  of  a  PHD  filter.  It  also  shows  that,  to  a  great  extent,  peak  extraction  plays  the  same 
role  in  a  PHD  filter  that  report-to-track  association  does  in  a  conventional  multi-hypothesis  multitarget  tracker.  The 
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Figure  3.  Tracking  Performance  of  PHD  Filter.  The  Hausdorff  multitarget  miss  distance  (vertical  axis)  is  plotted 
versus  observation  number.  Localization  accuracy  is  good  even  when  Target  1  overruns  Target  3  at  the  time  of  the 
38th  observation.  The  two  large  peaks  at  the  35th  and  40th  time-steps  are  due  to  peak-extraction  failures. 

* 

'•  computational  complexity  involved  in  data  association  is  not  entirely  sidestepped,  but  rather  is  partially  transferred 

to  the  peak-extraction  process. 

3.5.  Open  Research  Question:  Expected  Track-Sets 

;  Many  definitions  of  the  set-valued  expectation  E[T]  of  a  random  set  F  have  been  proposed  [9,  pp.  176-177],  but 

none  of  these  are  applicable  to  random  finite  sets— i.e.,  to  simple  point  processes.  At  the  1998  GTRI/ONR  Workshop 
on  Tracking  and  Sensor  Fusion  [28]  I  suggested  a  definition  of  a  track-valued  multitarget  expected  value  of  a  random 
I  track-set,  based  on  the  idea  of  transforming  finite  sets  into  multinomials  (similar  to  a  similar  transformation  proposed 

earlier  [9,  p.  179]).  This  definition  of  E[T]  produces  intuitively  acceptable  multitarget  expectations  in  special  cases, 
if  •  For  example,  if  the  track-set  T  =  {X!,  ..,Xn}  is  the  union  of  statistically  independent  tracks  Xi,  ...,Xn  whose 
H  .  respective  expected  values  are  Xj , Xn,  then  E [T]=  {Xi , ....  Xn}.  However,  the  approach  does  not  appear  to 
■K  be  satisfactory  in  general.  The  problem  of  correctly  defining  E[T]  is  an  open  research  question.  In  this  section  I 

'  |  argue  (as  I  did  in  the  1998  workshop)  that  any  acceptable  definition  of  E[T]  should  have  the  following  properties: 

1 7’  •  Consistency:  if  X  is  any  random  vector  then 

E[{X}]  =  {E[X]} 

That  is,  the  expectation  of  a  random  track-set  that  always  contains  a  single  random  track  should  itself  contain 
only  a  single  track;  and  this  track  should  be  the  expected  value  of  the  random  track.  {X}  —  {X} 

•  Empty  track-sets  are  ignored :  Let  T  be  a  random  track-set  and  let  E[T|r  ^  0]  denote  the  conditional 
expectation  of  F  given  that  it  is  non-empty.  (That  is,  whereas  E[F]  is  the  expectation  with  respect  to  the 
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multitarget  density  fr{X),  E[r|T  ^  0]  is  the  expectation  with  respect  to  the  multitarget  density  fr\r^d 
defined  by  fr\r^(X)  =  (1  -  /r(0))_1/r(X)  when  X  ^  0  and  /r|r^0(0)  =  0  otherwise.)  Then 

E[T]  =  E[r|IV  0] 

That  is,  empty  track-sets  should  provide  no  contribution  to  the  value  of  a  multitarget  expected  value. 

•  Preservation  of  geometry:  Let  ri,r2  be  statistically  independent  track-sets.  Then 

E[r1ur2]  =  E[r1]uE[r2] 

That  is,  when  two  random  track-clusters  are  statistically  non-interacting,  their  joint  multitarget  expected  value 
should  just  be  the  union  of  the  multitarget  expected  values  of  the  individual  track-clusters.  For  example, 
{X1|X2}  =  {X1|X2}  if  X!,X2  are  independent  random  vectors. 

•  Preservation  of  affine  transformations :  Let  T  be  a  random  track-set  and  a  a  constant  track- vector  and  T(x) 
any  linear  transformation  of  track-vectors  x.  Define  T({xi,...,xn})  =  {Txi,...,Txn}  and  X  +  {a}  = 
{x  +  a|  x  6  X}  for  any  subset  X.  Then 

E[T(r)  •+  {a}]  =  T(E[r])  +  {a} 

That  is,  to  Compute  the  multitarget  expected  value  of  a  random  track-set  after  a  (possibly  degenerate)  affine 
transformation  x  i — ->  T(x)  +  a  of  the  coordinate  system,  just  apply  the  same  transformation  to  its  expected 
value.  For  example,  {a  •  Xi  +  a,  a  ■  X2  +  a}  =  a  •  {Xi ,  X2}  +  {a}  where  Xi ,  X2  are  (not  necessarily 
independent)  random  vectors  and  a  is  a  constant. 

3.6.  Related  Approaches 

The  idea  of  using  a  single-target  density  function  gk\k  (or  the  probability  contours  of  its  graph)  as  a  basis  for 
multitarget  tracking  is  a  relatively  common  one.  Examples  of  implemented  algorithms  are  the  Naval  Research 
Laboratory’s  TABS  (Tactical  Antisubmarine-warfare  Battle-management  System)  tracker,  Metron  Corp.’s  Nodestar 
tracker  [47]  and  the  “probabilistic  mapping”  multitarget  tracking  approach  of  Tao,  Abileah,  and  Lawrence  [49].  The 
PHD  filter  differs  from  this  work  in  that  it  (1)  provides  a  solid  theoretical  foundation  for  single-density  approaches  in 
general,  (2)  clarifies  the  relationship  between  single-density  approaches  and  the  optimal  multitarget  filter,  (3)  makes 
explicit  the  theoretical  assumptions  that  implicitly  underlie  single-density  approaches  (in  particular,  the  assumption 
of  high  SNR),  and  (4)  validates  Stein  and  Winter’s  intuition  that  the  PHD  is  a  ’’theoretically  correct”  choice  of  gk\k 
as  well  as  their  concept  of  weak  evidence  accrual.”  Beyond  this,  the  PHD  approach  leads  to  new  practical  insight 
and  new  implementations  because  the  PHD  can  be  propagated  by  conventional  nonlinear  filtering  equations. 

The  concept  of  multitarget  Bayesian  nonlinear  filtering  (equations  29  and  30)  is  a  relatively  new  one.  If  one 
assumes  that  the  number  of  targets  is  known  beforehand,  the  earliest  exposition  appears  to  be  due  to  Washburn 
[51]  in  1987,  who  used  a  random  measure-based  point  process  approach.  When  the  number  n  of  targets  is  not 
known  and  must  be  determined  along  with  the  individual  target  states,  the  earliest  work  appears  to  be  due  to  Miller, 
O  Sullivan,  Srivastava,  et.  al.  [44],  [45].  Their  very  sophisticated  approach  utilizes  solution  of  stochastic  diffusion 
equations  on  non-Euclidean  manifolds.  It  is  also  apparently  the  only  approach  to  deal  with  continuous  evolution  of 
the  multitarget  state.  Mahler  was  apparently  the  first  to  systematically  deal  with  the  general  discrete  state-evolution 
case  (Bethel  and  Paras  [3]  assume  discrete  observation  and  state  variables).  Stone  et.  al.  have  provided  a  valuable 
contribution  by  clarifying  the  relationship  between  multitarget  Bayes  filtering  and  multi-hypothesis  correlation  [47, 
pp.  161-208],  [9,  p.  32],  [22,  p.  48],  Nevertheless  their  approach  is,  with  regrets,  best  described  as  “heuristic” 
for  reasons  described  more  fully  elsewhere  [22,  pp.  91-93].  Kastella’s  “joint  multitarget  probabilities  (JMP),” 
introduced  at  Lockheed  Martin  Tactical  Systems  in  1996,  are  a  renaming  of  a  number  of  early  core  FISST  concepts 
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(set  integrals,  multitarget  information  metrics,  multitarget  posteriors,  joint  multitarget  state  estimators,  etc.)  devised 
two  years  earlier  [37,  pp.  27-28],  [25].  Portenko  et.  al.,  also  using  a  random  measure-based  point  process  approach, 
use  branching-process  concepts  to  model  target  appearance  and  disappearance  [38]. 

Most  recently,  M.  Kouritzin  has  noted  [16]  that  the  theory  underlying  particle-systems  filters  also  subsumes  point 
processes  and  therefore  multitarget  tracking  as  well.  In  particular,  this  means  that  particle  system-based  multitarget 
nonlinear  filters  will  have  very  good  convergence  properties  (i.e.,  for  every  multitarget  observation  sequence,  the 
particle  density  will  converge  almost  surely  to  the  actual  multitarget  distribution).  It  also  means  that  particle- 
system  methods  can  be  applied  to  single-  and  multitarget  systems  whose  motion  models— e.g.,  heavy-tail  or  non- 
differentiable  models— are  too  ill-behaved  to  be  describable  by  conventional  Kolmogorov-Forward  (Fokker-Planck) 
equations. 

It  should  also  be  pointed  out  that  Mori,  Chong,  et.  al.  first  proposed  random  set  theory  as  a  potential  foundation 
for  multitarget  detection,  tracking,  and  identification  (although  within  a  multi-hypothesis  framework)  [35,  pp.  33- 
37].  Since  1995  Mori  has  returned  to  the  field  and  published  a  number  of  very  interesting  papers  based  on  random 
set  ideas  [32],  [33],  [34]. 


4.  SECOND-ORDER  MULTITARGET  STATISTICS 

The  PHD  filter  described  in  section  3  makes  use  only  of  the  first-order  multitarget  moment  statistic  and,  in  this  sense, 
is  a  statistical  analog  of  single-target,  constant-gain  Kalman  filters  such  as  the  a-/?-7  filter.  The  Kalman  filter  arises 
when  we  try  to  propagate  the  second-order  as  well  as  first-order  statistics  of  a  target.  The  extended  Kalman  filter 
(EKF)  results  when  we  expand  the  likelihood  functions  and  Markov  densities  in  Taylor’s  series,  in  order  to  propagate 
second-order  approximations  of  the  single-target  posterior  density  function.  It  is  natural  to  ask  whether  analogous 
things  can  be  done  for  multitarget  systems.  First,  can  We  propagate  a  second-order  approximation  of  the  multitarget 
posterior  density  (i.e.,  the  PHD  and  a  multitarget  covariance)  instead  of  the  multitarget  posterior  itself?  If  so,  we 
would  have  a  statistical  multitarget  analog  of  the  Kalman  filter.  Second,  can  we  expand  the  multitarget  likelihood 
function  and  multitarget  Markov  density  in  some  kind  of  ’’multitarget  Taylor  s  series  ?  If  so,  we  would  have  a  leg  up 
on  defining  a  direct  multitarget  analog  of  the  EKF.  The  purpose  of  this  section  is  to  discuss  these  issues.  Our  results 
are  preliminary,  but  can  be  summarized  as  follows.  At  this  time,  it  does  not  appear  likely  that  a  computationally 
tractable  ’’multitarget  Kalman  filter”  can  be  constructed  from  second-order  multitarget  moment  statistics.  However, 
it  appears  that  it  might  be  possible  to  develop  a  multitarget  analog  of  the  EKF,  though  the  potential  computational 
practicality  of  a  ’’multitarget  EKF”  remains  to  be  seen. 

The  section  is  organized  as  follows.  In  section  4.1  I  introduce  the  multitarget  analog  of  covariance-moment 
statistics,  the  multitarget  covariance  density  function.  Then,  in  section  4.2, 1  discuss  the  possibility  of  constructing 
multitarget  filters  based  on  second-order  multitarget  covariance  statistics — i.e.,  statistical  multitarget  analogs  of  the 
Kalman  filter.  In  section  4.3  I  revisit  the  EKF  and  identify  a  ’’missing  link”— namely,  multitarget  Taylor’s  series 
expansions — that  must  be  dealt  with  before  any  multitarget  analog  of  the  EKF  can  be  constructed.  Finally,  in  section 
4.4, 1  introduce  a  generalization  of  the  concept  of  a  point  target — the  ’’point-Poisson  track-cluster  and  show  how 
it  leads  to  Taylor’s  series  expansions  of  the  multitarget  log-likelihood. 

4.1.  The  Multitarget  Covariance  Density  Function 

In  section  2.1.61  introduced  the  factorial  cumulant  densities  of  a  point  process.  Applying  equation  1 9  to  the  random 
track-set  ,  we  bundle  the  factorial  cumulant  densities  crfc|fc,[n](xi>  •••>  xn)  int°  a  single  density  function  defined 
on  a  finite-set  variable  X  =  {xi, ...,  xn}  and  call  it  the  multitarget  covariance  density  function  of  the  track-set: 

cfc|fc(0|Z^)  =  0,  Ck\k{X\Z^)  =  crfc|fci[n](xi,  ...,xn) 
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From  equation  19  it  follows  that 


cm(X\zW)  =  S'0^(S\Z<») 


where  S  is  the  entire  state  space.  Thus,  the  multitarget  covariance  density  can,  like  the  multitarget  posterior  and 
moment  density  functions,  be  computed  using  iterated  set  derivatives.  Likewise,  equation  20  becomes 

logG*lfc[l  +  h]  =  Jhx •  ck\k(X\Z™)6X 

where  as  usual  hx  =  rLcex  Mx)  given  X  =  {xi,...,x„}.  As  before,  we  note  that  ck\k({x}\Z^)  = 
mk\k({x.}\Z^)  =  Dk\k(-x\Z W)  and  we  give  the  second-order  covariance 

Cfc|fc(xi,x2|Z(':))  =  ck  |fc({xi,x2}|Z(fc)) 


a  special  name:  it  is  the  covariance  density  of  the  random  track-set  rW. 

As  an  example,  if  rk\k  is  a  Poisson  track-cluster  then  ck\k{X\Z^)  =  0  for  all  |A|  >  1  [6,  p.  226]. 

4.2.  Second-Order  Filtering 

The  PHD  filter  of  section  3.3  was  based  on  the  assumption  that  the  PHD  is  an  approximate  sufficient  statistics  for 
the  multitarget  posterior  fk\k(X\Z(-k'>).  In  other  words,  the  track-set  Tk\k  is  assumed  to  be  approximately  Poisson 
and  thus  the  multitarget  posterior  is  approximately  the  multitarget  density  function  of  a  Poisson  process.  Stated  in 
the  language  of  multitarget  covariances,  this  is  the  same  thing  as  insisting  that  ck\k{X\Z^>)  =  0  for  all  \X\  >  1. 
Suppose  that  we  instead  insist  that  ck\k(X\Z(k})  =0  for  all  \X\  >  2.  This  means  that  Fk\k  would  be  the 
formal  point  process  statistical  analog  of  a  Gaussian  random  vector.  Such  a  track-cluster  is  known  in  point  process 
theory  as  a  ’’Gauss-Poisson  process”  [6,  266-267].  Any  Gauss-Poisson  process  is  essentially  just  a  Poisson  process 
in  which  each  track  is  replaced  by  a  pair  of  correlated  tracks  [6,  pp.  247-248]. 

If  we  were  to  follow  the  strategy  used  in  the  derivation  of  the  PHD  filtering  equations  35  and  36,  we  would 
proceed  as  follows.  We  would  assume  that  fk\k(X\Z^)  is  Gauss-Poisson  and  that,  consequently,  it  can  be  written 
as  a  combinatorial  sum  of  products  of  cfcjft(xi,x2|Z^))  and  Dk\k{yi\Z^).  Then  we  would  apply  the  multitar¬ 
get  Markov  prediction  integral  (equation  29)  to  derive  fk+i\h(X\Z(k))  and  then  compute  the  first-moment  density 
Ac+i|fc(xl-£^)  ar>d  covariance  density  cfc+1|fc(xi,  x2|Z^).  We  already  know  how  to  determine  Dk+i\k{x\Z^) 
exactly  from  equation  35.  It  turns  out  that  cfc+1|ft(xi,x2|Z^))  can  be  computed  by  applying  this  same  equa¬ 
tion  to  both  arguments  of  cfc|fc(xi,x2|Z(fc)).  The  difficulty  comes  with  the  multitarget  Bayes’  rule,  equation  30. 
We  would  apply  the  multitarget  likelihood  and  Bayes’  rule  to  get  fk+i\k+i{X\Z(k+1'))  and  then  try  to  compute 
Ac+i|fc+i(xl'£(fc+1))  c*.+i|fc+i(xi,x2|Z(fc+1)).  However,  the  answer  to  this  symbolic  computation  currently 

eludes  solution.  Moreover,  even  if  it  were  completed  successfully  the  approximate  Bayes’  rule  for  propagating 
Cft+i|fc(xi,x2|Z(fc))  to  cfc+1]fc+1(xi,x2|Z(fc+1))  may  be  so  complex  as  to  be  unusable  in  practice. 

Consequently  another  approach  is  called  for,  as  suggested  in  the  next  section. 


4.3.  Basic  Idea  for  a  Multitarget  EKF 

The  usual  development  of  the  extended  Kalman  filter  [5,  p.  109-11 1]  begins  with  a  sensor  measurement  and  target 
motion  model 


zfc  =  gfc(xft)  +  Vfc,  xfc+1  =  ffc(xfc)  +  Wfc 
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where  ffc  and  g ;fc  are  nonlinear  vector  transformations  and  ~Vk)Wk  are  Gaussian  random  noise  processes.  These 
models  are  then  linearized: 

gfc(x)  =  gk(x.k\k-i)  +  Ak  (x-xfc|fc_i),  ffc(x)  =ffc(xfc|fc)  +Bk  (x-xk\k) 

Afc  =  -^(xfc|fc_l),  Bk  ---  -Q^(x-k\k) 

where  Sh  (x)  =  (x))  denotes  the  Jacobian  matrix  of  the  vector  transformation  h,  evaluated  at  x  =  x. 

\  3  / 

It  is  not  possible  to  directly  extend  this  line  of  reasoning  to  the  multitarget  case.  For  example,  suppose  that 
two  statistically  independent  targets  with  states  xx  and  x2  are  present,  and  that  in  addition  to  the  noise  model 
zfc  =  gfc(xfe)  +  Vfc  the  sensor  has  no  missed  detections  but  its  observations  are  corrupted  by  an  independent 
Poisson  false  alarm  process  Ck.  Then  the  typical  multitarget  observation  would  be: 

Zk  =  (gfc(xi)  +  Vi,i}U{gi(x2)+Vti2}UCfc 

where  Vfc  1,  Vfc>2  are  independent,  identically  distributed  copies  of  Vfc.  Because  of  independence,  this  can  be 
converted  into  a  measurement  model  on  random  densities: 

Ck  —  ^gi(xi)+Vfcil  +  ^gk(X2)+Vfci2 

This  is  a  multitarget  analog  of  the  equation  zk  =  gfc(x)  +  V*,  but  it  does  not  help  us  very  much  since  it  cannot  be 
linearized  in  the  multitarget  state  variable  £  =  SXi  +  SX2. 

If  we  are  to  generalize  the  EKF  approximation  strategy  to  the  multitarget  case,  we  must  therefore  approach 
it  from  a  different  direction.  The  above  nonlinear-Gaussian  models  can  be  equivalently  expressed  in  terms  of  a 
Markov  density  /fc+1|fc(y|x)  and  a  likelihood  function  LZk(x)  -  fk(zk\x)  whose  logarithms  are 

iog/*+n*(y|x)  = 

log  LZk  (x)  =  IcgJVo^OJ-^Zfc-gfcCxjfQfc^Zfc-gfclx)) 


Using  the  EKF  approximation,  the  log-likelihood  can  be  rewritten  as 

log  LZk  (x)  =  log  JVi3fc(0)  _  2  \  -Ak  (x  -  x.k\k-i)  )  Qk  \  ~Ak  (x-x^) 

=  logLZt(xfc|fc_i)  -  (z k  ~  gfc(xfc|fc-i))TQfc1^fc  (x-Xfc|fc_i) 

—  2  (x  —  Xfc|fc-i)  A%QklAk  (x  -  xfc|fc_i) 


or,  equivalently,  as  a  truncated  Taylor’s  series  expansion: 

SlogLzi.  \  ,  1  d2  log  LZk  N 

log  lz.(x)  a  logi,.^,)  +  + 


where 


=  )  =  -xjATtQ~t1Atx2 

(The  logarithm  of  the  Markov  density  can,  of  course,  be  expanded  in  the  same  way.) 


d  log  Lz 
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In  order  to  apply  this  line  of  reasoning  to  the  multitarget  case,  we  would  need  to  be  able  to  generalize  equation 
37  to  a  similar  multitarget  equation 


log LZk{X)  Sf  log LZk{Xk\k_x)  + 


dlogLZk 

d{X-Xk{k_x) 


(^fc|A-l)  + 


d 2  log  Li 


2d(X 


**!*-i)2 


where  Zk  is  the  multitarget  measurement,  X,Xk\k~i  are  multitarget  states,  and  LZk(X)  =  fk(Zk\X)  is  the 
multitarget  likelihood  function.  This  cannot  be  accomplished  because  the  arithmetic  difference  X  -  Xk\k_i  is 
not  defined,  and  because  we  do  not  know  how  to  compute  Frechet  derivatives  §§-(Y)  of  functions  F(X)  of  a 
finite-set  variable  X.  This,  in  turn,  is  due  to  the  fact  that  F(X)  is  ’’inherently  non-differentiable”  because  of 
the  discrete  jumps  in  target  number  that  occur  in  the  argument  X.  In  the  next  section,  I  propose  the  following 
remedy:  extending  LZk(X)  to  a  multitarget  likelihood  functional  LZk[h ]  that  is  differentiable  in  the  function- 
valued  argument  h  with  respect  to  the  Frechet  functional  derivative  (dF/dg)[h\  defined  in  equation  10  of  section 


4.4.  Generalized  Targets:  Point  Track-Clusters 

We  propose  the  following  idea:  extend  the  concept  of  a  point  target  with  state  x  to  that  of  a  point  track-cluster 
with  state  (a,  x).  By  this  I  mean  a  cluster  of  targets,  all  of  which  are  co-located  at  x  and  the  expected  number  of 
which  is  a  >  0.  If  a  =  1  then  the  point  cluster  (1,  x)  models  a  point  target.  A  group  of  point  clusters  is  just  a 
finite  set  of  the  form  X  -  {(aj,  Xl), ...,  (a„,  xn)}.  In  section  2.1.1  we  noted  (equation  8)  that  X  is  mathematically 
equivalent  to  the  density  function  h  =  +  ...  +  an8Xn.  This  identification  allows  us,  in  turn,  to  interpret  any 

function  h{x)  as  a  continuously  infinite  collection  of  point  target- clusters— meaning  that  a  point  track-cluster  is 

located  at  each  point  x  of  state  space  and  the  expected  number-density  of  targets  in  this  cluster  is  h(x). 

Given  this,  suppose  that  we  have  a  single  sensor  with  single-target  likelihood  function  Lz.  (x)  =  fk  (zk  |x)  and 
state-dependent  probability  of  detection  pD(x),  which  we  assume  can  be  extended  to  a  function  pD{a,x)  of  target 
number  a  as  well  as  target  state  x.  Then  I  will  show  how  to  define  a  likelihood  functional  Lz.\h]  =  fk(Zk\h) 
that  extends  the  multitarget  likelihood  function  LZk(X )  =  fk(Zk\X)  and  which  describes  the  statistics  of  the 
o  servation-sets  Zk  generated  by  all  targets  in  the  continuously  infinite  group  h  of  point  target-clusters.  This  then 
leads  to  the  desired  multitarget  generalization  of  equation  37: 


l°g LZk (X)  —  log LZk(Xk^k_fi  -f 


d  log  Lz, 


1  ^  log  LZk 


d^x~%k-X  +  2d(5x  -6 *  ‘ 


where  the  sum  on  the  right  is  a  Taylor’s  series  expansion  in  terms  of  the  general  functional  derivatives  (equation  11) 
introduced  in  section  2.1.3. 


Note:  For  present  purposes,  I  am  not  proposing  the  use  of  the  multitarget  likelihood  functional  Lz  [h]  in  place 
°/’ th"  ‘ :  Ukelih°°d  funCti°n  Lz^X)-  ™S  WOuld  entail  the  placement  of  multitarget  density  functions 

f  t,  f  by  multitar§et  density  functionak  fk]k[h\Z^}  and  the  use  of  functional  nonlinear  filtering  equations 


fk+i\k[h\zW]  =  J fk+1]k[h\g }  fk]k[g\ZW}Vg  (39) 

A(Zt+liz<*))  =  /iz,+lWA+1|tlh|^),m 

where  /  Vh  is  a  suitable  multi-dimensional  generalization  of  theWiener  junctional  integral  [40,  pp.  159, 186-188], 
[8],  [31],  This  is  because  we  are  searching  for  estimates  that  are  true  multitarget  states  h  =  6^  <*=>  Xk,k  rather 
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than  abstract  functions  (PHDs)  hk\k.  If  we  modeled  multitarget  systems  using  PHD’s,  on  the  other  hand,  then 
equations  39  would  be  the  appropriate  theoretical  approach. 

I  also  show  that  the  likelihood  functional  can  be  approximated  as  a  functional  Gaussian  distribution 


Lz(X)  =  feM  •  exp  ^  j  (e*(y)cr2(yh)°(y-dy') 
if  there  is  a  unique  function  h=  ho  that  solves  the  implicit  differential  equation 

Hy)-j^-(h(y)>y)  +pd(K  y),y)  =  o 


and  if,  in  addition,  the  function 


l  _  2Pp{h o(y),y) 

<r2(y)  h0(y)2 


d^PD 
da 2 


(^o(y),y)  >  o 


(40) 


(41) 


is  integrable  and  positive-valued  for  all  y. 

In  more  detail:  Suppose  that  we  are  given  a  sensor  with  single-target  measurement  model  z  =  g(x)  +  W, 
likelihood  function  /(z|x)  =  /w(z  -  g(x)),  and  state-dependent  probability  of  detection  pc(x).  Suppose  that 
T  is  a  Poisson  point  track-cluster  with  intensity  function  D(y)  where  a  =  f  D{y)dy  >0  is  the  expected  number 
of  targets.  Also,  assume  that  probability  of  detection  is  a  function  p;;(a,x)  of  target  number  a  as  well  as  target 
state  x,  with  pD(  1,  x)  =  pD(x).  Then  it  is  easily  shown  (see  section  5.4)  that  the  multitarget  observation-set  is 
also  Poisson  and  that  its  multitarget  likelihood  function  is 


f(Z\a,x)  =  e  °'PD(a’x)  (a-pD(a,x)  ■/( z|x)) 
zez 


Suppose,  next,  that  we  have  a  collection  X  —  {(ai,xi), ...,  (a^Xn)}  of  several  independent  point  target-clusters. 
Then  the  multitarget  likelihood  function  is  again  Poisson  and  is  (see  section  5.5): 

n  n 

f(Z\X)  =  e~ax  D(x\X),  D{ z\X)  =  J]aipD(ai)xi)/(z|xf),  ax  =  Y^,aiPD{ai,*i) 

z  ez  »=i  *=1 

By  analogy,  suppose  that  the  number  n  of  point  clusters  increases  continuously  without  bound  in  such  a  manner 
that  the  empirical  distribution  <5Xl  +  ...  4-  anSXn  converges  to  some  positive-valued  function  h.  Then  h  can 
be  interpreted  as  a  continuously  infinite  family  of  point  track-clusters,  such  that  the  point  track-cluster  located  at 
x  contains  an  average  number  (density)  fi(x)  of  targets.  The  expected  number  of  targets  over  the  entire  space 
is,  therefore,  ah  =  f  h(x)clx  if  the  integral  exists.  By  analogy,  say  that  the  likelihood  Junctional  describing  the 
observations  generated  by  all  targets  is 

Lz[h\  =  f(Z\h)  =  []  DM 

z  ez 

where 

Dz[h]  =  /  h(x)  pD(h{x),x)  /(z|x)dx,  a[h\  —  J  h(x)  pD(h(x),x)dx 
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Notice  that  log  Lz[h]  is  continuous  resp.  differentiable  in  h  if  pD(a,x)  is  continuous  resp.  differentiable  in 
a.  In  fact,  application  of  the  functional  derivative  (equation  10)  and  a  little  algebra  shows  (see  section  5.6)  that 


6-^-[h\  =  (h(y)^-{h(y),y)+pD{h(y)iy)Sj  f(z\y) 

^[h]  =  h{y)^(h(y),y)+pD(h{y),y) 


<5  log  Lz 


[h)  = 


+  (, 


My>'-^<My),y)  +  pn(h(y),y) 


In  like  manner,  a  little  more  algebra  shows  that  the  second  derivative  of  the  log-likelihood  is 


<i>2  log  Lz 
<5y2<5yi 


[h]  = 


-  (E/(z|ggly2?)  (MyI)^(MyI),y1)+PD(My.),y1)) 

•  (h(y2)^-(h(y2),y2)  +  PD(h(y2),y2)^ 

+  (a^  (Ky.)^(MyO.yi))  +^MMyi),y.)) 


where  we  make  use  of  the  identity 


^•/(h(u),  v)  =  ^(h(u),  v)<5y(u) 


Equations  40  and  41  directly  follow  from  these  results.  For,  if  40  holds  for  some  h  —  ho  then 


8  log  Lz 


[ho]  =  0 


for  all  y  and  so 


log  Wfc]  -  logLaN  = 


If  h  is  sufficiently  near  ho,  the  second-order  term  on  the  right-hand  side  of  the  equation  dominates  all  higher-order 
terms,  and  the  function  ho  corresponds  to  an  absolute  maximum  if  and  only  if  the  right-hand  side  of  the  equation 
is  always  negative.  This  can  be  true  if  and  only  if  equation  41  is  true  identically. 


5.  APPENDIX  A:  MATHEMATICAL  PROOFS 


5.1.  Proof: 


We  are  to  show  that  Pr(S)  =  GY  [Is]-  We  know  that  the  multi-object  density  fr{X)  of  the  point  process  T  is  the 
same  thing  as  the  corresponding  family  of  its  Janossy  densities:  /r({xi, ...,  xk})  =  jY(xi> ....  xfc).  So, 

oo  i  . 

Gt [Is]  =  Xfci  /  ls(xi) ' ' '  Isfck)  Jr(xi,  ...,Xfc)dxi  •  •  •  dxk 
k= o  ^ 

=  X  £1  Js„  /r((xi.  •••> Xft})dxi '  ’  dxit  =  jf  fr(X)8X  =  /%•(£) 


161 


5.2.  Proof: 

We  are  to  show  that 


6Pr(q\-6G rf1  i 


Let  S  be  a  closed  set,  let  Ex  denote  an  arbitrarily  small  region  surrounding  the  vector  x,  and  let  A (Ex)  denote 
its  hypervolume  (Lebesgue  measure).  Our  proof  will  be  informal  and  is  based  on  the  observation  that,  in  the  limit 
as  A (Ex)  ->  0,  the  characteristic  function  lEx  strongly  resembles  a  Dirac  delta  function  that  has  been  multiplied 
by  a  very  small  postive  number: 

lfi,(y)  =  HE.)  ■  —  A(f»)  .  «y) 

Suppose  that  both  the  set  derivative  and  the  functional  derivative  ^  exist.  Then  by  the  definition  of  a  set 
derivative  (equation  22,  [22,  p.  30],  [9,  pp.  145-146, 150]): 

S/3r  _  /3r{SuEy)  -f3r(S)  ..  Gr[lsugy]  - gr[ls] 

~  x(i?\o  A  (Ey)  HEy)\o  A  (By) 

,.  +  l£;y]  —  Gpfls]  .  Gr[ls  +  A(J5y)<5y]  —  Gt[1s] 

A(£y)\0  A  (Ey)  l(By)\0  A  (Ey) 

gr[ls  +  £^y]  -  <?r[ls]  _  lim  Grjls  +  e<M  -  Gr[lg]  _  £Gr 


5.3.  Proof: 


We  are  to  show  that  K{fk\k;fk\k)  is  minimized  if  Dk]k(x)  =  Dklk{x.\Z^).  First,  notice  that 


K(fklk,fk\k)  =  I  /fc|fc(X|Z(fc))log/fc,fc(X|zW)5X-.|/,|fc(X|Z(fc))log  n^W) 

=  const.  +  Nk |fc  -  J  (fk]k{X\ZW)  log£)fc|fc(x)^  5X 


However,  a  bit  of  algebra  shows  that 

J  ^fk\k{X\Z^)  log  £>fc|fc(x)^  6X  =  Nk\k  + 

where  sfc|fc(x|ZW)  =  Dk\k{x\Z^)/Nk\k  and  4|fc(x|^(fc))  =  Dk\k^\Z{k))/ Nk\k.  So, 

K{fk\k\h\k)  =  const.  —  Nk\k  J sklk(x\zW)\ogsklk(x)dx  +  Nklk  -  Nk[k\ogNk]k 
=  const.  +  Nk\kK(sk\k,sk\k)  +  Nk\k  -  Nk\k  log  iVfc|fc 

This  is  minimized  when  K(sk\k,sklk)  and  Nk\k  -  Nk jfc  log iVfcj*  are  separately  minimized,  i.e.  when  sk \k  =  sk\k 
and  Nk]k  =  Nk]k. 
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5.4.  Proof 

We  are  given  a  sensor  with  single-target  measurement  model  z  =  g(x)  +  V,  state  space  S,  likelihood  function 
/(z|x)  =  /v(z  -  g(x)),  state-dependent  probability  of  detection  pu(x),  and  a  Poisson  track-cluster  T  with 
intensity  function  D(y )  where  a  =  f  D(y)dy  >0  is  the  expected  number  of  targets.  Suppose  further  that  F  is 
a  Poisson  point  track-cluster  and  that  the  probability  of  detection  pn(a,  x)  is  a  function  of  average  target  number  a 
as  well  as  of  target  state.  We  are  to  show  that  if  the  track-cluster  is  a  point  track-cluster,  then  the  observation-process 
generated  by  it  is  a  Poisson  process  with  intensity  function  D(z\a,  x)  =  a  •  p£»(a,  x)  •  /(z|x). 

Let  v  be  a  Poisson-distributed  random  non- negative  integer  with  parameter  a  and  let  X1(  ...  be 

independent,  identically  distributed  (i.i.d.)  random  vectors  with  distribution  s(x)  =  D(x) /a.  From  the  footnote  in 
section  3.2,  we  know  that  T  =  {Xi, ...,  X„}  is  a  Poisson  track-cluster  with  the  specified  characteristics.  Let  0? 
be  a  random  set  which  takes  only  the  two  values  0,  S  with  Pr(0?  =  0)  =  1  —  q  and  Pr(0?  =  S )  =  q.  Then  the 
observation-set  £  produced  by  the  track-cluster  is 

£  =  U  ({g(Xi)  +  VO  n 

t=l 

where  Vi,  ...,Vi, ...  are  i.i.d.  The  corresponding  belief-mass  function  is 

oo  n 

MS)  =  Pr(£C5)  =  X;Mn)nPr({g(x0+Vi}n^(a-Xi)c5) 

71=0  1=1  ' 

oo  n 

=  E^(n)IIPrfe(X*)+V*>n^(a’Xi)  C51) 

n=0  i=l  ' 

However,  assuming  independence  of  all  random  quantities  we  get 

Pr  ({g(Xi)  +  Vi}  n  0^(a’Xi>  C  s)  =  J  Pr  ({g(y)  +  Vf}  n  ^ a’y)  C  s)  s(x)dx 

=  J  C1  -  PD(a ,  y)  +  pD(a,  y)  Pr  (g(y)  +  V{  G  S ))  s(y)dy 
=  J  (1  ~PD(a,y)  +Pv(a,y)  Pz(s\y))  s(y)dy 

where  pz(S \y)  =  fsf(z\y)dz.  So, 


Ms)  =  5^Pi/(n)  (  /(I  ~Pd{y)  +  Pd{y)  Pz(S\y))  s(y)dy) 

n=0  / 

=  exp  (“/  (1  -  Pd( y)  +  Pd{ y)  Pz(5|y))  s(y)dy  -  a\ 


This  is  the  belief-mass  function  of  a  Poisson  observation-cluster.  Now  assume  that  the  track-cluster  is  a  point  cluster, 
i.e.  s(y)  =  <5x(y).  Then  the  belief-mass  function  becomes 


/^(Sla.x)  =  exp(ap£>(a,x)  (pz(5|x)  -  1)) 

This  is  the  belief-mass  function  of  a  Poisson  process  with  intensity  function  D(z\a,  x)  =  a  ■  pD(a ,  x)  •  /(z|x),  as 
claimed. 
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5.5.  Proof: 

We  are  to  show  that 

n  n 

f(Z\X)  =  e~ax  D{ z\X),  D( z\X)  =  ^aipD(ai>xi)/(z|xi),  ax  =  s£jaipD[ai^) 

z ez  *=i  *=1 

The  belief-mass  function  of  each  point-Poisson  process  is 

Pi(S)  =  exp  ( aipD(a,i ,  Xi)pz(5|xi)  -  ajpr>(ai,Xi)) 

Since  the  processes  are  independent,  the  belief-mass  function  of  their  union  is  also  that  of  a  Poisson  process. 


f3(S)  =  pi (S)  ■  --Pn(S)  =  exp  (  5\ipi?(ai,Xi)pzOS'|x;)  -  ^Ojpzjfa.Xi)  J  =  exp(D(z|#)  -  ax) 


as  claimed. 

5.6.  Proof 

We  are  to  show  that 


=  6>(  y)^(k(y).y)+PD(ft(y).y))/(*|y) 

~{hl  =  h(y)^(h(y),y)  +pD(My),y) 


51og  Lzr  , 
“ly- 1  1  = 


-i+E 


/(z|y> 

Dz[h] 


^(y)-|^(/i(y),  y)  +  PD(h(y),  y) 


The  second  equation  follows  easily  from  the  first,  and  the  third  follows  from  the  first  and  second  since 

*logiz-M= -&W + 


So,  we  need  only  prove  the  first  equation: 

8DZ  rtl  Dz[h  +  e6y]  -  Dz[h)  f  fv_  1  {h  +  e8y)(x.)pD(h(x)  +  efiy(x),x)/( z|x)  ^ 

I7W  =  J™„ - 7 - =  A~°«  -fc(x)pD(k(x).x)/(2|x)  A 

f  1  h(x)pD{h(-x.)  +  e<5y(x), x)/(z|x)  +  +  e<5y(x), x)/(z|x)  \ 

=  J{^o-e  -*(x)w>(fc(x),x)/(  z|x)  r3 

f  f  1  /i(x)po(/i(x)  +  e<5y(x),x)/(z|x)  -  /i(x)p2j(/i(x),x)/(z|x)  N 
J  +£<5y(x)pz3(h(x)  +  e5y(x),x)/(z|x)  7 


Therefore 


W  =  /Kx)/(,|x)  * 

+  /  lim  6y(x)  pD(h(x)  +  e<5y(x),x)  /(z|x)dx 

J 

=  y'/i(x)/(z|x)5y(x)^(h(x),x)dx  +  y"(5y(x)pi3(Ax),x)  /(z|x)cb 


My)- -J^(My).y)  +  pc(My)»y)  J  /(zly) 
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6.  CONCLUSIONS 

In  this  paper  I  introduced  multitarget  moment  statistics  of  arbitrary  order,  and  have  tried  to  illustrate  their  potential 
importance  for  multitarget  tracking.  I  emphasized  the  importance  of  point  process  theory  to  multisensor-multitarget 
detection,  tracking,  and  identification.  Despite  its  neglect  in  multitarget  tracking  circles,  this  theory  is  the  statistical 
foundation  for  all  multi-object  problems,  and  its  importance  to  future  theoretical  and  practical  research  should  not 
be  underestimated.  The  point  process  literature  contains  a  vast  array  of  results,  techniques,  and  computational 
approaches  that  await  ’’mining”  for  application  to  multitarget  problems.  Towards  that  end,  I  have  provided  a  brief 
but  systematic  introduction  to  point  process  theory,  and  in  particular  to  the  ’’engineering-friendly”  version  of  it  that 
I  call  finite-set  statistics  (FISST). 

I  showed  how  a  viable  Bayes  nonlinear  filter  can  be  devised  for  the  first-order  multitarget  moment  (the  ’’probabil¬ 
ity  hypothesis  density”  or  ’’PHD”),  thus  yielding  a  computable,  systematic  multitarget  tracking  approach  that  avoids 
report-to-track  association.  I  described  preliminary  simulations  that  suggest  its  possible  benefits  and  possible  chal¬ 
lenges  in  practice.  Chief  among  the  limitations,  other  than  the  necessity  for  high  signal-to-noise  ratio,  is  the  need 
for  good  multi-peak  extraction  algorithms.  This  is  because  multi-peak  extraction  largely  replaces  report-to-track 
association  in  a  PHD  multitarget  filter.  I  also  pointed  out  an  open  research  question:  the  problem  of  defining  a 
track-valued  (rather  than  density-  or  measure- valued)  first-order  multitarget  moment  statistics. 

I  initiated  a  study  of  the  second-order  statistics  ofmultitarget  systems.  While  this  study  is  preliminary,  it  suggests 
possible  directions  for  research.  I  described  the  second-order  moment  and  covariance  statistics  of  a  multitarget 
system,  and  indicated  why  it  does  not  appear  possible  to  develop  a  computationally  tractable  multitarget  tracking 
filter  based  on  these  statistics  at  this  time.  However,  I  also  described  a  potential  path  to  a  multitarget  generalization 
of  the  Kalman  filter.  This  approach  depends  upon  (1)  being  able  to  extend  any  multitarget  likelihood  function  to 
a  so-called  ’’multitarget  log-likelihood  functional,”  and  (2)  expanding  this  functional  in  a  Taylor’s  series,  using  a 
generalization  of  the  FISST  set  derivative  called  the  functional  derivative. 
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